<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Research on RORO's blog</title><link>https://blog.rodolpheg.xyz/tags/research/</link><description>Recent content in Research on RORO's blog</description><generator>Hugo</generator><language>fr-fr</language><managingEditor>contact@rodolpheg.xyz (0xRo)</managingEditor><webMaster>contact@rodolpheg.xyz (0xRo)</webMaster><lastBuildDate>Wed, 08 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://blog.rodolpheg.xyz/tags/research/index.xml" rel="self" type="application/rss+xml"/><item><title>Semgrep Architecture: Comprehensive Reference</title><link>https://blog.rodolpheg.xyz/posts/semgrep-architecture/</link><pubDate>Wed, 08 Apr 2026 00:00:00 +0000</pubDate><author>contact@rodolpheg.xyz (0xRo)</author><guid>https://blog.rodolpheg.xyz/posts/semgrep-architecture/</guid><description>&lt;blockquote>
&lt;p>&lt;strong>Semgrep&lt;/strong> (Semantic Grep) is a multi-language static analysis tool that matches code by its &lt;em>structure&lt;/em> &amp;ndash; not just its text &amp;ndash; using a unified Abstract Syntax Tree and a rich pattern language. This document covers the full internal architecture, from the CLI entry point to taint sink detection.&lt;/p>
&lt;/blockquote>
&lt;h2 id="table-of-contents">Table of Contents&lt;/h2>
&lt;ol start="0">
&lt;li>&lt;a href="#0-introduction">Introduction&lt;/a>
&lt;ul>
&lt;li>&lt;a href="#01-what-is-sast">What is SAST?&lt;/a>&lt;/li>
&lt;li>&lt;a href="#02-history-of-semgrep">History of Semgrep&lt;/a>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>&lt;a href="#1-high-level-architecture">High-Level Architecture&lt;/a>&lt;/li>
&lt;li>&lt;a href="#2-component-breakdown">Component Breakdown&lt;/a>&lt;/li>
&lt;li>&lt;a href="#3-the-full-analysis-pipeline">The Full Analysis Pipeline&lt;/a>&lt;/li>
&lt;li>&lt;a href="#4-target-discovery--filtering">Target Discovery &amp;amp; Filtering&lt;/a>&lt;/li>
&lt;li>&lt;a href="#5-rule-parsing--optimization">Rule Parsing &amp;amp; Optimization&lt;/a>&lt;/li>
&lt;li>&lt;a href="#6-parsing--the-universal-ast">Parsing &amp;amp; the Universal AST&lt;/a>&lt;/li>
&lt;li>&lt;a href="#7-the-matching-engine">The Matching Engine&lt;/a>&lt;/li>
&lt;li>&lt;a href="#8-the-intermediate-language-il--cfg">The Intermediate Language (IL) &amp;amp; CFG&lt;/a>&lt;/li>
&lt;li>&lt;a href="#9-taint-analysis-dataflow">Taint Analysis (Dataflow)&lt;/a>&lt;/li>
&lt;li>&lt;a href="#10-output--reporting-pipeline">Output &amp;amp; Reporting Pipeline&lt;/a>&lt;/li>
&lt;li>&lt;a href="#11-osemgrep--rpc-architecture">OSemgrep / RPC Architecture&lt;/a>&lt;/li>
&lt;li>&lt;a href="#12-key-data-structures">Key Data Structures&lt;/a>&lt;/li>
&lt;/ol>
&lt;hr>
&lt;h2 id="-1-why">-1. Why?&lt;/h2>
&lt;p>I have been working as a DevSecOps engineer for nearly four years now. When I started, I had little exposure to tooling like SAST, SCA, or DAST. My mindset was firmly rooted in offensive security. Penetration testing was the goal, the dream.&lt;/p></description></item><item><title>Understanding Code Property Graphs</title><link>https://blog.rodolpheg.xyz/posts/understanding-code-property-graphs/</link><pubDate>Tue, 05 Aug 2025 00:00:00 +0000</pubDate><author>contact@rodolpheg.xyz (0xRo)</author><guid>https://blog.rodolpheg.xyz/posts/understanding-code-property-graphs/</guid><description>&lt;p>When I first started developing tools for source code auditing, my primary need was to track tainted data flows through complex codebases during manual code reviews. Initially, I turned to Tree-sitter, which proved excellent for single-file analysis with its fast, incremental parsing capabilities. However, as I scaled to larger codebases with complex cross-file dependencies and data flows, Tree-sitter&amp;rsquo;s AST-only approach became limiting. The challenge wasn&amp;rsquo;t just parsing individual files. It was understanding how data flows between functions, across modules, and through various execution paths during thorough manual security assessments.&lt;/p></description></item><item><title>Code auditing 101</title><link>https://blog.rodolpheg.xyz/posts/code-auditing--101/</link><pubDate>Sat, 02 Aug 2025 00:00:00 +0000</pubDate><author>contact@rodolpheg.xyz (0xRo)</author><guid>https://blog.rodolpheg.xyz/posts/code-auditing--101/</guid><description>&lt;h2 id="topics-covered">Topics covered&lt;/h2>
&lt;p>This post explores the evolution from manual code review to automated security testing, covering:&lt;/p>
&lt;ul>
&lt;li>The reality of manual code review and its limitations&lt;/li>
&lt;li>Understanding vulnerabilities vs weaknesses&lt;/li>
&lt;li>How SAST tools work under the hood&lt;/li>
&lt;li>Taint analysis and data flow tracking&lt;/li>
&lt;li>Sink-to-source vs source-to-sink methodologies&lt;/li>
&lt;li>Mitigation strategies: whitelisting vs blacklisting&lt;/li>
&lt;li>Dealing with false positives in practice&lt;/li>
&lt;li>Choosing and implementing SAST tools at scale&lt;/li>
&lt;li>The complementary relationship between manual and automated testing&lt;/li>
&lt;/ul>
&lt;p>It&amp;rsquo;s 3 AM. You&amp;rsquo;re on your fifth cup of coffee, eyes bloodshot, staring at line 2,847 of a 10,000-line pull request. Somewhere in this maze of curly braces and semicolons lurks a SQL injection vulnerability that could bring down your entire application. Welcome to the glamorous world of manual code review!&lt;/p></description></item></channel></rss>