You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+15Lines changed: 15 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -23,6 +23,21 @@ A lightweight, distributed SQL database engine. Designed for cloud environments
23
23
-**Volcano & Vectorized Engine**: Flexible execution models supporting traditional row-based and high-performance columnar processing.
24
24
-**PostgreSQL Wire Protocol**: Handshake and simple query protocol implementation for tool compatibility.
25
25
26
+
## Performance
27
+
28
+
CloudSQL is engineered for extreme performance, outperforming industry standards like SQLite in raw execution speed:
29
+
30
+
-**6.6M+ Point Inserts/s**: Optimized prepared statement caching and batch insert fast-paths make CloudSQL **58x faster** than SQLite.
31
+
-**181M+ Rows Scanned/s**: Zero-allocation `TupleView` architecture and lazy deserialization make CloudSQL **9x faster** than SQLite for sequential scans.
@@ -27,9 +27,11 @@ Following our latest optimizations, `cloudSQL` completely bridged the insert gap
27
27
3.**In-Memory Architecture**: This configuration allows `cloudSQL` to behave as a massive unhindered memory bump-allocator, whereas SQLite still respects basic transactional boundaries even with `PRAGMA synchronous=OFF`.
28
28
29
29
### Sequential Scans
30
-
We reduced the scan gap from 6.5x down to **4.0x** slower than SQLite. The remaining gap is attributed to:
31
-
1.**Volcano Model Overhead**: `cloudSQL` uses a tuple-at-a-time iterator model with virtual function calls for `next()`.
32
-
2.**Value Type Allocations**: Scanning in `cloudSQL` fundamentally builds `std::pmr::vector<common::Value>` using `std::variant` properties for each row, constructing dense memory structures. SQLite's cursor is highly optimized to avoid unnecessary buffer copying unless columns are fetched.
30
+
We have completely flipped the scan gap. `cloudSQL` is now **~9x faster** than SQLite for raw sequential scans. This was achieved by:
31
+
1.**Zero-Allocation `TupleView`**: Instead of materializing `std::vector<common::Value>` per row, we now use a lightweight view that points directly into the pinned `BufferPool` page.
32
+
2.**Lazy Deserialization**: Values are decoded only when accessed, reducing work for read columns, but `TupleView` currently still walks prior fields up to `col_index`, so later-column access still pays the cost of preceding fields.
33
+
3.**Fast-Path MVCC**: For non-transactional scans (the common case for bulk data processing), we bypass complex visibility logic and only perform a single `xmax == 0` check.
34
+
4.**Iterator Caching**: The `PageHeader` is now cached during page transitions, eliminating repetitive `memcpy` calls in the scan hot path.
33
35
34
36
## 5. Post-Optimization Enhancements
35
37
We addressed the gaps via the following optimizations:
@@ -38,6 +40,7 @@ We addressed the gaps via the following optimizations:
38
40
3.**Batch Insert Mode**: Skipping single-row undo logs and exclusive locks to exploit pure in-memory bump allocation. This drove the `INSERT` speedup well past SQLite limits, as we write raw tuples uninterrupted.
39
41
40
42
## 6. Future Roadmap
41
-
To close the remaining 4.0x gap in `SEQ_SCAN`:
42
-
* Use zero-copy `TupleView` classes directly mapping against the buffer page to avoid allocating `std::vector<common::Value>` per row.
43
-
* Switch to Arrow-based columnar execution architecture for vectorized OLAP.
43
+
With the scan gap closed, our focus shifts to higher-level analytical throughput:
44
+
***Stage 1: SIMD-Accelerated Filtering**: Utilize AVX-512/NEON instructions to filter multiple rows in a single CPU cycle.
45
+
***Stage 2: Vectorized Execution**: Move from row-at-a-time `TupleView` to batch-at-a-time `VectorBatch` processing.
46
+
***Stage 3: Columnar Storage**: Transition from row-oriented heap files to columnar persistence for extreme analytical scanning.
-**Core Technology**: Zero-allocation `TupleView` classes and lazy deserialization.
33
+
-**Comparison**: Outperforms SQLite by 9x in raw scan throughput.
34
+
35
+
This provides the necessary groundwork for future SIMD and full vectorized optimizations.
32
36
33
37
## Status: 100% Test Pass
34
38
Successfully verified the end-to-end vectorized pipeline, including columnar data persistence and complex analytical query patterns, through dedicated integration tests.
0 commit comments