Software

Software we have developed to help non-domain experts access our algorithms

MyDataPilot

MyDataPilot is an AI-powered data science assistant that enables anyone to perform sophisticated data analysis without coding or machine learning expertise. Simply describe what you want to analyze in plain English, and MyDataPilot handles everything from data loading to generating publication-ready visualizations. Work directly with your local files - no uploading required - while keeping full visibility into the AI's decision-making process. Watch each step of the AI's reasoning, modify its approach as needed, and download professional Python scripts for future use. Whether you're analyzing sales trends, discovering customer insights, or cleaning messy datasets, MyDataPilot transforms complex data science tasks into simple conversations, making expert-level analysis accessible to everyone.

GenePrep

GenePrep is an automated multi-agent system that revolutionizes the preprocessing and analysis of large-scale gene expression data from GEO and TCGA databases. By simply installing the package, users can execute end-to-end workflows including data validation, trait-condition pair selection, statistical testing, and comprehensive result generation - all with minimal manual scripting. Given a dataset and trait-condition pairs, GenePrep intelligently identifies genes associated with specific traits while properly accounting for experimental conditions. Its modular agent architecture enables iterative planning, execution, and debugging, dramatically reducing preprocessing overhead while improving reproducibility. Built on research presented in "Toward a Team of AI-made Scientists for Scientific Discovery from Gene Expression Data" and the GenoTEX benchmark, GenePrep represents a significant leap forward in automating genomic data analysis and accelerating scientific discovery.
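
To give a rough sense of what "identifying genes associated with a trait while accounting for a condition" can mean statistically, here is a minimal sketch using partial correlation. It is not GenePrep's implementation or API: the function names (`residualize`, `partial_correlation`, `rank_genes`) and the toy data are hypothetical, and the actual statistical tests GenePrep runs may differ.

```python
from statistics import mean


def residualize(values, covariate):
    """Residuals of `values` after a simple least-squares fit on `covariate`."""
    mx, my = mean(covariate), mean(values)
    sxx = sum((x - mx) ** 2 for x in covariate)
    sxy = sum((x - mx) * (y - my) for x, y in zip(covariate, values))
    slope = sxy / sxx if sxx else 0.0
    return [y - (my + slope * (x - mx)) for x, y in zip(covariate, values)]


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den if den else 0.0


def partial_correlation(expression, trait, condition):
    """Correlation of expression and trait after removing the condition's linear effect."""
    return pearson(residualize(expression, condition), residualize(trait, condition))


def rank_genes(expression_by_gene, trait, condition):
    """Sort genes by absolute condition-adjusted association with the trait."""
    scores = {
        gene: partial_correlation(values, trait, condition)
        for gene, values in expression_by_gene.items()
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy data (hypothetical): 8 samples, a binary condition, and a binary trait.
condition = [0, 0, 0, 0, 1, 1, 1, 1]
trait = [0, 1, 0, 1, 0, 1, 0, 1]
expression_by_gene = {
    # gene_a follows the trait within each condition group
    "gene_a": [1.0, 2.1, 0.9, 2.0, 3.0, 4.1, 2.9, 4.0],
    # gene_b differs between condition groups but is unrelated to the trait
    "gene_b": [1.0, 1.0, 1.2, 1.2, 3.0, 3.0, 3.2, 3.2],
}

ranking = rank_genes(expression_by_gene, trait, condition)
for gene, score in ranking:
    print(f"{gene}: adjusted association = {score:.3f}")
```

In this toy setup, `gene_b` varies strongly with the condition, yet its adjusted score is essentially zero because it carries no trait signal once the condition is removed. That is the kind of confounding the condition adjustment is meant to guard against.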

Robustar

Robustar is an interactive toolbox designed to support precise data annotation and robust vision learning through an intuitive visual interface. Unlike traditional black-box machine learning systems, Robustar empowers users to understand and improve their models through a transparent, iterative workflow. Import your trained models and test samples, then leverage influence functions to identify which training samples most impact predictions. Use integrated saliency maps to visualize exactly which image regions drive model decisions, then employ drawing tools to mask out superficial or misleading pixels. These refined annotations serve as augmented training data for continued model improvement. This human-in-the-loop approach enables researchers and practitioners to build more robust vision models by systematically identifying and correcting the features their models rely on, bridging the gap between model performance and genuine understanding of visual concepts.
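
The masking step described above can be illustrated with a small sketch. It is only a conceptual outline under simplifying assumptions, not Robustar's code or API: images are plain nested lists of grayscale values, the user's drawing is a binary mask of the same shape, and the names `mask_pixels` and `augment_with_annotations` are hypothetical.

```python
def mask_pixels(image, mask, fill=0.0):
    """Return a copy of `image` with every pixel flagged in `mask` set to `fill`."""
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [fill if flagged else pixel for pixel, flagged in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


def augment_with_annotations(dataset, annotations, fill=0.0):
    """Append a masked copy of each annotated sample to the training set.

    dataset: list of (image, label) pairs
    annotations: dict mapping a sample index to its user-drawn binary mask
    """
    masked_copies = [
        (mask_pixels(dataset[index][0], mask, fill), dataset[index][1])
        for index, mask in annotations.items()
    ]
    return dataset + masked_copies


# Toy example (hypothetical): one 2x3 image whose right column is a spurious cue.
dataset = [([[0.2, 0.8, 0.9], [0.1, 0.7, 0.9]], "cat")]
annotations = {0: [[0, 0, 1], [0, 0, 1]]}

augmented = augment_with_annotations(dataset, annotations)
print(augmented[1])
```

The original sample is kept alongside its masked copy, so continued training sees both the full image and a version in which the misleading region is no longer available as a shortcut.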