Arnav Garg

Machine Learning Lead at Predibase

San Francisco, California

Overview 

Arnav Garg is a Machine Learning Lead at Predibase in San Francisco, California, with a strong background in Large Language Models (LLMs) and expertise in C++, TensorFlow, Python, and various other programming languages and tools. He has held machine learning and software engineering roles at companies including Atlassian and Tesla, and he co-founded DataRes at UCLA.

Work Experience 

  • Machine Learning Lead, Senior Machine Learning Engineer

    2024 - Current

    Leading Predibase’s machine learning team. Recent work includes leading the development of our reinforcement fine-tuning offering, co-creating Turbo LoRA for efficient fine-tuning and 3x faster inference via speculative decoding, developing a synthetic data generation algorithm that beats K-shot GPT-4o with just 10 rows, building continuous LoRA training mechanisms, and co-authoring LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4.

  • Senior Machine Learning Engineer

    2023 - 2024

    I focus on applied machine learning, optimizing fine-tuning workflows, and scaling distributed training and inference for open-source LLMs. My work includes designing reliability mechanisms to make training cost-effective, efficient, and resilient—so users can focus on iteration, not infrastructure. I'm also a lead maintainer of Ludwig, an open-source, YAML-based framework for low-code multimodal deep learning. 🔗 Explore my open-source work: github.com/arnavgarg1

  • Machine Learning Engineer

    2022 - 2023

Highest quality, fastest throughput small language models in your cloud

Raised $28,450,000.00 from Felicis, Anthony Goldbloom, Yi Wang, Greylock, Ben Hamner, Factory, Varun Badhwar, Remi El-Ouazzane, Zoubin Ghahramani and Sancus Ventures.

  • Machine Learning Scientist, Core Machine Learning

    2022 - 2022

    Building machine-learning-powered smart features for Confluence and Trello. Some of the things I was responsible for during my time at Atlassian:

    1. Building models to suggest users and spaces to follow on the Confluence Home Feed
    2. Content recommendations across Confluence, including general suggestions and related pages (patented)
    3. Suggesting users to invite to boards and workspaces on Trello
    4. Propensity modeling for Confluence editions
    5. Building internal tooling to quickly test models without full frontend or backend integration

  • Associate Machine Learning Scientist, Core Machine Learning

    2021 - 2022

  • Mentor

    2021 - 2021

  • Technical Advisor

    2021 - 2021

  • Co-Founder and President

    2018 - 2020

    I co-founded DataRes, UCLA’s first data science and machine learning organization that caters to everyone from undergraduates to PhDs. Website: https://ucladatares.com/ Facebook: https://www.facebook.com/ucladatares/ Medium: https://medium.com/@ucladatares

  • Machine Learning Scientist Intern, Core Machine Learning

    2020 - 2020

    I was part of Atlassian's Core Machine Learning (CML) team, the centralized ML group, and their first intern hire in the US. I worked on scaling feature generation across product-focused machine learning with self-supervised learning, drawing on ideas from state-of-the-art NLP.

  • Product Manager at OpenAQ

    2020 - 2020

    I led 4 developers to work on open-source air quality data aggregation services for NASA Global Modeling and Assimilation Office (GMAO) and the World Resources Institute (WRI).

  • Fellow

    2020 - 2020

    I was one of 24 fellows across 5 countries (<1% acceptance rate).

  • Software Engineering Intern

    2019 - 2019

    As part of Tesla's Low Voltage Controllers (Electronic Systems) team, I identified and resolved a critical flaw in the Autopilot SOC validation manufacturing process, significantly enhancing the robustness testing of Autopilot hardware (HW 2.5 and HW 3.0). I also developed a real-time dashboard to monitor and detect issues in Autopilot SOC stress test systems across Tesla’s global manufacturing network.

Tesla Motors is an electric vehicle and clean energy company that provides electric cars, solar, and renewable energy solutions.

Raised $19,374,213,101.00 from European Union, PennDOT, Australian Renewable Energy Agency and Massachusetts Clean Energy Center.

  • Software Engineering Intern, Machine Learning

    2019 - 2019

    As part of Expressive's backend and machine learning teams, I developed and productionized deep learning models (BERT, Transformers, CNNs) for tasks like sentence similarity, metaphor paraphrasing, information retrieval, and context comprehension, improving the accuracy of Expressive's virtual service agent. On the backend, I implemented a feature to bulk import knowledgebase articles from ServiceNow, significantly reducing onboarding time.

  • Software Engineer

    2018 - 2019

    As an early employee at Kona (formerly Sike Insights), a Kleiner Perkins-backed startup, I helped build a web application (now a Slack extension) for remote teams to assess EQ compatibility and provide managers with personalized insights to enhance productivity and reduce turnover. I also developed an encryption layer around DynamoDB to ensure secure handling of user data.

  • Data Science Intern

    2018 - 2018

    I was part of a global team of 25 data scientists across the company. I developed a deep neural network that predicts how likely users are to install the Starbucks mobile app after being shown Starbucks advertisements; the model significantly improved the app's installation conversion rate.
