Phileas Hocquard

Senior Machine Learning Engineer

Senior Machine Learning Engineer with 6+ years of experience developing and deploying ML models at scale for recommendation, search, and NLP applications. Specialized in designing machine learning models, from classical algorithms to deep learning architectures, along with the data orchestration around them, to drive measurable business outcomes.

I graduated with an MSc in Applied Computing from the University of Toronto, where I was supervised by Prof. Scott Sanner. I also hold a BSc in Computer Science from King's College London.

Notable Projects

Gumroad Search Project Demo

Search Discovery Case Study for Gumroad

Search Recommendations ML Ranking

Comprehensive analysis and prototype implementation of improved search functionality for Gumroad's marketplace, delivering significant improvements in search relevance and product discovery.

Search Quality Improvements:
nDCG +157.8%
Mean Weighted Precision +135.1%
Average Relevance Ranking +54.3%
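
For readers unfamiliar with the headline metric, the sketch below shows one common formulation of nDCG (linear gain, log2 discount) and how a relative improvement is derived from two scores. The graded labels and the cutoff k are hypothetical; the exact variant and evaluation set used in the case study are not specified here.

```python
import math


def dcg(relevances, k):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg(relevances, k=10):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


def relative_change(baseline, candidate):
    """Percentage change of a candidate score over a baseline score."""
    return (candidate - baseline) / baseline * 100.0


# Hypothetical labels for one query (0 = irrelevant, 3 = highly relevant),
# listed in the order each system returned the results.
baseline_ranking = [0, 1, 0, 3, 2]
prototype_ranking = [3, 2, 1, 0, 0]

if __name__ == "__main__":
    base = ndcg(baseline_ranking)
    proto = ndcg(prototype_ranking)
    print(f"baseline nDCG={base:.3f}, prototype nDCG={proto:.3f}, "
          f"change={relative_change(base, proto):+.1f}%")
```

In practice the per-query scores would be averaged over a labelled query set before computing the relative change.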

Issues Identified - Lack of Relevance

  • Title Weighting Issues: Product titles underweighted, causing less relevant results to appear first
  • Lack of Fuzzy Matching: Minor misspellings resulted in no relevant items
  • Language Localization Gaps: Search ignores user language preferences and geography
  • Query Expansion Limitations: General searches miss relevant category items
  • UX Pain Points: Creator spam, lack of grouping, inconsistent ranking logic

Prototype Implementation

  • Exact match for queries surrounded by quotes
  • Two-phase search process:
    • Phase 1: Combined-field, match phrase, and fuzzy matching
    • Phase 2: Batched cosine-similarity re-ranking over averaged, uncompressed ColBERT embeddings, with rating boosting
  • CLIP text-image embeddings as a fallback model
  • Simple front-end sorting options: Relevance, Rating, Price (Low/High)
  • Empirically weighted reputation boosting (client side)
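
The two-phase flow listed above can be sketched roughly as follows. This is a minimal illustration, not the prototype's code: the field names (`title`, `description`), the boost values, the `rating_weight` parameter, and the toy token vectors are all assumptions. Phase 1 only builds an Elasticsearch query body as a plain dictionary; Phase 2 mean-pools per-token vectors (standing in for ColBERT token embeddings) and re-ranks by cosine similarity with a small rating boost.

```python
import math


def build_phase1_query(raw_query, fields=("title^3", "description")):
    """Build an Elasticsearch query body (hypothetical field names and boosts)."""
    text = raw_query.strip()
    # Queries wrapped in double quotes are treated as exact phrases.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        phrase = text[1:-1]
        clauses = [{"match_phrase": {f.split("^")[0]: phrase}} for f in fields]
    else:
        clauses = [
            {"combined_fields": {"query": text, "fields": list(fields)}},
            {"match_phrase": {"title": {"query": text, "boost": 2.0}}},
            {"match": {"title": {"query": text, "fuzziness": "AUTO"}}},
        ]
    return {"bool": {"should": clauses, "minimum_should_match": 1}}


def mean_pool(token_vectors):
    """Average a list of equal-length token vectors into one vector."""
    count = len(token_vectors)
    return [sum(column) / count for column in zip(*token_vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query_token_vectors, candidates, rating_weight=0.1):
    """Re-rank Phase 1 candidates by boosted cosine similarity (assumed formula)."""
    query_vector = mean_pool(query_token_vectors)
    scored = []
    for cand in candidates:
        similarity = cosine(query_vector, mean_pool(cand["token_vectors"]))
        boost = 1.0 + rating_weight * (cand["rating"] / 5.0)  # ratings assumed 0-5
        scored.append((similarity * boost, cand["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand_id for _, cand_id in scored]
```

Mean pooling trades ColBERT's token-level late interaction for a single vector per item, which keeps the re-ranking step cheap enough to run as one batch.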

Architecture

  • Containerized system with:
    • Ingestion service
    • Core ML service
    • API service
    • Elasticsearch (v8.17)
    • Frontend service (Node & ReactJS)
    • NGINX
  • Multi-level caching: front-end, NGINX, and API services
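
As an illustration of the API-level layer of the caching described above, a minimal time-to-live cache might look like the sketch below. The class name, the injectable clock, and the 30-second TTL are hypothetical; the front-end and NGINX layers would be configured separately and are not shown.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested deterministically
        self._store = {}

    def get_or_compute(self, key, compute):
        """Return a fresh cached value, or call compute() and cache its result."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self.ttl_seconds, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=30)
    # The lambda stands in for a call to the search backend.
    print(cache.get_or_compute("q=python course", lambda: ["result-1", "result-2"]))
```

Keeping the TTL short at this layer limits how long stale rankings can be served after the index changes.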

Proposed Improvements

  • Quick Wins: Field weight adjustment, fuzzy matching, language detection, query expansion
  • Mid-term: Hybrid search model, content-based recommendations, time-aware recommendations
  • Advanced Solutions: Neural collaborative filtering, Graph Neural Networks, Reinforcement Learning, RAG-Enhanced Search

Professional Experience

Founder & Machine Learning Lead
July 2023 - Present
  • Instanalyze.com (B2B): Architected an advanced social media analytics and influencer discovery platform processing 100 profiles/second, 20M+ daily events, and a 10M+ influencer database
  • Designed and implemented search and discovery systems using RAG, NER, and LLMs for Instagram influencer identification
  • Consulting: Delivered ML solutions for search engines, DeFi agents, and predictive markets for blockchain clients (Solana, Polygon)
  • Innovation: Built proprietary stack with distributed crawling and network analysis for scalable real-time insights
Technology: Kubernetes, Docker Compose, PyTorch, Ray, RAG {LangGraph, Adaflow, DSPy, Magnetic, Instructor, Pydantic}, HuggingFace, LLMs, Distributed Systems, Bare Metal, Redis, GCP, AWS
Machine Learning Lead
September 2019 - November 2022
  • Led ML R&D team of 5 engineers, delivering systems across 1,150 sites serving ~200M monthly unique users
  • Created the company's first distributed ML search engine, achieving ~16% traffic increase and ~25% CTR on over 1 billion threads
  • Built novel forum representation models using Transformers & Graph Neural Networks for cross-domain content
  • Drove ~18% YoY traffic engagement growth, supporting the company's successful IPO
  • Engineered: model benchmarking & monitoring for A/B/n testing, live model swapping for Vespa.ai, and ELT pipelines

Areas of Applied Research:

  • Entity linking, text summarization, graph & text representation learning
  • Recommender systems, search (LTR/IR/Deep), social network analysis
  • Anomaly detection & spam filtering
Technology: GCP, Kubernetes, Kubeflow, Docker, PyTorch, TensorFlow, Keras, Jenkins, Spinnaker, Helm, Terraform, gRPC, Vespa.ai
About the company: VerticalScope owns thousands of communities, including RateMDs, AVSForum, and RedFlagDeals, with ~110 million monthly active users (2022). Ranked among the top 10 social media platforms in North America.
ROSS Intelligence (Y Combinator backed / OpenAI partner)
Machine Learning Engineer
First Legal Natural Language Search Tool in North America
May 2018 - June 2019
  • Improved search quality with re-engineered SOTA architectures, reaching 5-13% increase in MAP
  • Built proprietary ranking models (Q&A MLMM, DeepRank-inspired), increasing critical legal case retrieval by 22% (R@50)
  • Deployed deep learning-based extractive question-answering models in production
  • Led full-stack development with Terraform, Docker, AWS, Solr, Elasticsearch, Python
Technology: AWS, Java, JavaScript, PyTorch, TensorFlow, Keras, Jenkins, Kubernetes, Solr, Elasticsearch
About the company: A legal tech startup that partnered with OpenAI and was backed by Y Combinator.
King's College London - Research Intern (Adaptive Interfaces)
August - September 2016
Supervisor: Dr. Theodora Koulouri
Bank Degroof - Banking Intern (All Offices)
July 2011

Education

University of Toronto

MSc in Applied Computing

September 2017 - June 2019

King's College London

BSc in Computer Science

September 2014 - July 2017

  • Artificial Intelligence
  • Agents and Multi-Agent Systems
  • Planning AI
  • Software Engineering
  • Cryptography
  • Text Search and Processing

Université Laval "Program préparatoire"

BEng in Computer Engineering (1st year)

September 2013 - July 2014

  • Mathematics: Calculus I/II/III, Vector Algebra
  • General Chemistry & General Biology
  • Physics I/II/III: Classical Mechanics, Waves and Particles, Electricity and Magnetism

Notable Achievements

Research Grants & Cost Savings

Awarded $30,000 for research at ROSS Intelligence (2018)
Helped VerticalScope save $300K+ annually (2019-2021)

Technical Leadership (2019-2022)

Organized and delivered company-wide lectures on Deep Learning & NLP at VerticalScope (1,000+ cumulative attendance)

Vulnerability Disclosures (2015-18)

XSS (United Airlines, RBC), CWE-472 (Uber, goo.gl), RFI (Stanford CS Homepage)

Academic Prizes (2015-16)

Credit Suisse Prize - 1st place, 8-week group competition. O'Reilly Prize - 1st/255, 2nd year CS Programming

Events (2014-2017)

UK's 1st Major League Hackathon (won prize) / HackCambridge (qualified) / HackLondon, TEDxKCL (organizer)

Repository Access
The repository code is available upon request. Please contact me via email for access.