Rubao Lee

Email: lee.rubao@ieee.org

About Me

I am a freelance computer scientist specializing in algorithm engineering, parallel computing, and advanced computer systems. I received my Ph.D. in Computer Science in 2008 from the Institute of Computing Technology, Chinese Academy of Sciences, and I am passionate about developing scalable, high-performance solutions for complex computing challenges. I live in central Ohio, USA.

Honors and Awards

Software Code

Publication: Book

Data Management: Interactions with Computer Architecture and System
Xiaodong Zhang and Rubao Lee
Cambridge University Press
View on Amazon

Publication: VIP (Very Important Paper)

Software-Defined Software: A Perspective of Machine Learning-Based Software Production (PDF)
Rubao Lee, Hao Wang, Xiaodong Zhang
ICDCS 2018: 2018 IEEE 38th International Conference on Distributed Computing Systems, 1270-1275, July 2, 2018.

Publication: Full Papers (DBLP | Google Scholar)

  1. LibRTS: A Spatial Indexing Library by Ray Tracing
    Liang Geng, Rubao Lee, Xiaodong Zhang
    PPoPP 2025: Proceedings of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2025.
    To appear.
  2. High-Performance Spatial Data Analytics: Systematic R&D for Scale-Out and Scale-Up Solutions from the Past to Now
    Fusheng Wang, Rubao Lee, Dejun Teng, Xiaodong Zhang, Joel Saltz
    VLDB 2024: Proceedings of the VLDB Endowment 17(12), 4507-4520, 2024.
    DOI | PDF
  3. X-TED: Massive Parallelization of Tree Edit Distance
    Dayi Fan, Rubao Lee, Xiaodong Zhang
    VLDB 2024: Proceedings of the VLDB Endowment 17(7), 1683-1696, 2024.
    DOI | PDF | Code
  4. RTScan: Efficient Scan with Ray Tracing Cores
    Yangming Lv, Kai Zhang, Ziming Wang, Xiaodong Zhang, Rubao Lee, Zhenying He, Yinan Jing, X. Sean Wang
    VLDB 2024: Proceedings of the VLDB Endowment 17(6), 1460-1472, 2024.
    DOI | PDF
  5. RayJoin: Fast and Precise Spatial Join by Ray Tracing Cores
    Liang Geng, Rubao Lee, Xiaodong Zhang
    ICS 2024: Proceedings of the 38th ACM International Conference on Supercomputing, June 4-7, 2024, Kyoto, Japan.
    DOI | PDF | Code
  6. UltraPrecise: A GPU-based Framework for Arbitrary-Precision Arithmetic in Database Systems
    Xin Li, Mengbai Xiao, Dongxiao Yu, Rubao Lee, Xiaodong Zhang
    ICDE 2024: Proceedings of the 40th IEEE International Conference on Data Engineering, May 13-17, 2024, Utrecht, Netherlands.
    DOI | PDF | Code
  7. An RDMA-enabled In-memory Computing Platform for R-tree on Clusters
    Mengbai Xiao, Hao Wang, Liang Geng, Rubao Lee, Xiaodong Zhang
    ACM TSAS: ACM Transactions on Spatial Algorithms and Systems, 8(2), 1-26, 2022.
    DOI | PDF
  8. The Art of Balance: A RateupDB Experience of Building a CPU/GPU Hybrid Database Product
    Rubao Lee, Minghong Zhou, Chi Li, Shenggang Hu, Jianping Teng, Dongyang Li, Xiaodong Zhang
    VLDB 2021: Proceedings of the VLDB Endowment 14(12), 2999-3013, 2021.
    DOI | PDF
  9. NestGPU: Nested Query Processing on GPU
    Sofoklis Floratos, Mengbai Xiao, Hao Wang, Chengxin Guo, Yuan Yuan, Rubao Lee, Xiaodong Zhang
    ICDE 2021: Proceedings of IEEE 37th International Conference on Data Engineering, 1008-1019, April 19-22, 2021, Online.
    DOI | PDF | Code
  10. Automating Incremental and Asynchronous Evaluation for Recursive Aggregate Data Processing
    Qiange Wang, Yanfeng Zhang, Hao Wang, Liang Geng, Rubao Lee, Xiaodong Zhang, Ge Yu
    SIGMOD 2020: Proceedings of the ACM SIGMOD International Conference on Management of Data, 2439-2454, June 14-19, 2020, Portland OR, USA.
    DOI | PDF | Code
  11. Catfish: Adaptive RDMA-enabled R-tree for Low Latency and High Throughput
    Mengbai Xiao, Hao Wang, Liang Geng, Rubao Lee, Xiaodong Zhang
    ICDCS 2019: Proceedings of 39th International Conference on Distributed Computing Systems, 164-175, July 7-9, 2019, Dallas, Texas, USA.
    DOI | PDF
  12. DirectLoad: A Fast Web-Scale Index System Across Large Regional Centers
    An Qin, Mengbai Xiao, Jin Ma, Dai Tan, Rubao Lee, Xiaodong Zhang
    ICDE 2019: Proceedings of the 35th International Conference on Data Engineering, 1790-1801, April 8-11, 2019, Macau SAR, China.
    DOI | PDF
  13. SEP-graph: Finding Shortest Execution Paths for Graph Processing Under a Hybrid Framework On GPU
    Hao Wang, Liang Geng, Rubao Lee, Kaixi Hou, Yanfeng Zhang, Xiaodong Zhang
    PPoPP 2019: Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming, 38-52, February 16-20, 2019, Washington DC, USA.
    DOI | PDF | Code
  14. SQLoop: High Performance Iterative Processing in Data Management
    Sofoklis Floratos, Yanfeng Zhang, Yuan Yuan, Rubao Lee, Xiaodong Zhang
    ICDCS 2018: Proceedings of the IEEE 38th International Conference on Distributed Computing Systems, 1039-1051, July 2-5, 2018, Vienna, Austria.
    DOI | PDF
  15. A Low-cost Disk Solution Enabling LSM-tree to Achieve High Performance for Mixed Read/Write Workloads
    Dejun Teng, Lei Guo, Rubao Lee, Feng Chen, Yanfeng Zhang, Siyuan Ma, Xiaodong Zhang
    ACM TOS: ACM Transactions on Storage, 14(2), pp 15, 2018.
    DOI | PDF
  16. A Distributed In-memory Key-value Store System on Heterogeneous CPU-GPU Cluster
    Kai Zhang, Kaibo Wang, Yuan Yuan, Lei Guo, Rubao Lee, Xiaodong Zhang, Bingsheng He, Jiayu Hu, Bei Hua
    VLDB Journal: The International Journal on Very Large Data Bases, 26(5), 729-750, 2017.
    DOI | PDF
  17. LSbM-tree: Re-Enabling Buffer Caching in Data Management for Mixed Reads and Writes
    Dejun Teng, Lei Guo, Rubao Lee, Feng Chen, Siyuan Ma, Yanfeng Zhang, Xiaodong Zhang
    ICDCS 2017: Proceedings of the IEEE 37th International Conference on Distributed Computing Systems, 68-79, June 5-8, 2017, Atlanta GA, USA.
    DOI | PDF
  18. Feisu: Fast Query Execution over Heterogeneous Data Sources on Large-Scale Clusters
    An Qin, Yuan Yuan, Dai Tan, Pengyu Sun, Xiang Zhang, Hao Cao, Rubao Lee, Xiaodong Zhang
    ICDE 2017: Proceedings of the IEEE 33rd International Conference on Data Engineering, 1173-1182, April 19-22, 2017, San Diego CA, USA.
    DOI | PDF
  19. Spark-GPU: An Accelerated In-Memory Data Processing Engine on Clusters
    Yuan Yuan, Meisam F. Salmi, Yin Huai, Kaibo Wang, Rubao Lee, Xiaodong Zhang
    BigData 2016: Proceedings of 2016 IEEE International Conference on Big Data, 273-283, December 5-8, 2016, Washington DC, USA.
    DOI | PDF
  20. Internal Parallelism of Flash Memory based Solid State Drives
    Feng Chen, Binbing Hou, Rubao Lee
    ACM TOS: ACM Transactions on Storage, 12(3), August 2016.
    DOI | PDF
  21. BCC: Reducing False Aborts in Optimistic Concurrency Control with Low Cost for In-memory Databases
    Yuan Yuan, Kaibo Wang, Rubao Lee, Xiaoning Ding, Jing Xing, Spyros Blanas, Xiaodong Zhang
    VLDB 2016: Proceedings of the VLDB Endowment 9(6), 504-515, 2016.
    DOI | PDF | Code
  22. MegaKV: A Case for GPUs to Maximize the Throughput of In-Memory Key-Value Stores
    Kai Zhang, Kaibo Wang, Yuan Yuan, Lei Guo, Rubao Lee, Xiaodong Zhang
    VLDB 2015: Proceedings of the VLDB Endowment 8(11), 1226-1237, 2015.
    DOI | PDF | Code
  23. GDM: Device Memory Management for GPGPU Computing
    Kaibo Wang, Xiaoning Ding, Rubao Lee, Shinpei Kato, Xiaodong Zhang
    SIGMETRICS 2014: ACM SIGMETRICS Performance Evaluation Review 42(1), 533-545, 2014.
    DOI | PDF
  24. Major Technical Advancements in Apache Hive
    Yin Huai, Ashutosh Chauhan, Alan Gates, Gunther Hagleitner, Eric Hanson, Owen O’Malley, Jitendra Pandey, Yuan Yuan, Rubao Lee, Xiaodong Zhang
    SIGMOD 2014: Proceedings of the ACM SIGMOD International Conference on Management of Data, 1235-1246, June 22-27, 2014, Snowbird UT, USA.
    DOI | PDF | Code
  25. Concurrent Analytical Query Processing with GPUs
    Kaibo Wang, Kai Zhang, Yuan Yuan, Siyuan Ma, Rubao Lee, Xiaoning Ding, Xiaodong Zhang
    VLDB 2014: Proceedings of the VLDB Endowment 7(11), 1011-1022, 2014.
    DOI | PDF
  26. Understanding Insights into the Basic Structure and Essential Issues of Table Placement Methods in Clusters
    Yin Huai, Siyuan Ma, Rubao Lee, Owen O'Malley, Xiaodong Zhang
    VLDB 2014: Proceedings of the VLDB Endowment 6(14), 1750-1761, 2013.
    DOI | PDF
  27. S-CAVE: Effective SSD Caching to Improve Virtual Machine Storage Performance
    Tian Luo, Siyuan Ma, Rubao Lee, Xiaodong Zhang, Deng Liu, Li Zhou
    PACT 2013: Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 103-112, September 7-11, 2013, Edinburgh, Scotland.
    DOI | PDF
  28. Hadoop-GIS: A High-Performance Spatial Data Warehousing System over MapReduce (2024 VLDB Test of Time Award)
    Ablimit Aji, Fusheng Wang, Hoang Vo, Rubao Lee, Qiaoling Liu, Xiaodong Zhang, Joel H. Saltz
    VLDB 2013: Proceedings of the VLDB Endowment 6(11), 1009-1020, 2013.
    DOI | PDF | Code
  29. The Yin and Yang of Processing Data Warehousing Queries on GPU Devices
    Yuan Yuan, Rubao Lee, Xiaodong Zhang
    VLDB 2013: Proceedings of the VLDB Endowment 6(10), 817-828, 2013.
    DOI | PDF | Code
  30. Accelerating Pathology Image Data Cross Comparison on CPU-GPU Hybrid Systems
    Kaibo Wang, Yin Huai, Rubao Lee, Fusheng Wang, Xiaodong Zhang, Joel H. Saltz
    VLDB 2012: Proceedings of the VLDB Endowment 5(11), 1543-1554, 2012.
    DOI | PDF
  31. hStorageDB: Heterogeneity-aware Data Management to Exploit the Full Capability of Hybrid Storage Systems
    Tian Luo, Rubao Lee, Michael Mesnier, Feng Chen, Xiaodong Zhang
    VLDB 2012: Proceedings of the VLDB Endowment 5(10), 1076-1087, 2012.
    DOI | PDF
  32. DOT: A Matrix Model for Analyzing, Optimizing and Deploying Software for Big Data Analytics in Distributed Systems
    Yin Huai, Rubao Lee, Simon Zhang, Cathy Honghui Xia, Xiaodong Zhang
    SOCC 2011: Proceedings of the 2nd ACM Symposium on Cloud Computing, 1-14, October 26-28, 2011, Cascais, Portugal.
    DOI | PDF
  33. YSmart: Yet another SQL-to-MapReduce Translator (2011 ICDCS Best Paper Award)
    Rubao Lee, Tian Luo, Yin Huai, Fusheng Wang, Yongqiang He, Xiaodong Zhang
    ICDCS 2011: Proceedings of the 31st International Conference on Distributed Computing Systems, 25-36, June 20-24, 2011, Minneapolis MN, USA.
    DOI | PDF | Code
  34. RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems
    Yongqiang He, Rubao Lee, Yin Huai, Zheng Shao, Namit Jain, Xiaodong Zhang, Zhiwei Xu
    ICDE 2011: Proceedings of the 27th International Conference on Data Engineering, 1199-1208, April 11-16, 2011, Hannover, Germany.
    DOI | PDF
  35. Essential Roles of Exploiting Internal Parallelism of Flash Memory based Solid State Drives in High-Speed Data Processing
    Feng Chen, Rubao Lee, Xiaodong Zhang
    HPCA 2011: Proceedings of IEEE 17th International Symposium on High Performance Computer Architecture, 266-277, February 12-16, 2011, San Antonio TX, USA.
    DOI | PDF
  36. MCC-DB: Minimizing Cache Conflicts in Multi-core Processors for Databases
    Rubao Lee, Xiaoning Ding, Feng Chen, Qingda Lu, Xiaodong Zhang
    VLDB 2009: Proceedings of the VLDB Endowment 2(1), 373-384, 2009.
    DOI | PDF
  37. Exploiting Stream Request Locality to Improve Query Throughput of a Data Integration System
    Rubao Lee, Zhiwei Xu
    IEEE TC: IEEE Transactions on Computers 58(10), 1356-1368, 2009.
    DOI | PDF
  38. Request Window: An Approach to Improve Throughput of RDBMS-based Data Integration System by Utilizing Data Sharing Across Concurrent Distributed Queries
    Rubao Lee, Minghong Zhou, Huaming Liao
    VLDB 2007: Proceedings of the 33rd International Conference on Very Large Data Bases, 1219-1230, September 23-27, 2007, Vienna, Austria.
    DOI | PDF