Work
Riley Hunter is a Lead UX Researcher who specializes in GenAI product research, development, and benchmarking. He helps teams craft & implement comprehensive mixed-methods UXR programs.
Diary Study: uncovering what makes Notes memorable for US Teens
This 5-day study explored not just which moments are most memorable for teens in Notes, but what made them stand out. We designed the study using peak-end theory and known delight indicators—such as high excitement levels, longer session durations, reactions from friends, and felt emotions—to pinpoint and isolate truly delightful teen experiences. We then triangulated these moments with survey data on the top use cases that drive teen usage. This analysis informed product and design interventions to amplify engagement with these delightful use cases through GenAI creation suggestions, GenAI visual styling tools, and hidden “Easter-egg” features to increase delight in these standout moments.
Discovery Research: Guiding Notes’ First Steps into GenAI
In this GenAI project, we collaborated with multiple Meta GenAI teams—including those behind successful GenAI plays in Stories, Reels, and Direct—to understand how Meta AI has driven increases in topline creation and usage through features like AI guidance for creators, text inputs, stickers, backgrounds, and photo restyling. We identified two primary people problems: (1) the inspiration problem: users need to see inspiring or creative Notes to motivate their own creation, and (2) the personalization problem: users need tools to style and contextualize their Notes to reflect their unique personality. Based on these findings, we recommended two GenAI plays for the H2 roadmap: (1) using Meta AI as an intelligent layer in Notes to surface personalized creation suggestions to address the inspiration problem and (2) introducing GenAI-powered contextual visual restyling to address the personalization problem.
Survey: Investigating Reductions in Teen DAU for Notes
This large-scale survey used detailed age-and-tenure segmentation (new teens, existing teens, new young adults, existing young adults, and never-users) and a MaxDiff design to quantify the use cases and motivations that drive, or inhibit, Notes usage. Our analysis revealed a clear behavioral shift: new teens engage with Notes across a wide range of conversational and expressive use cases, but as users mature in both age and tenure, their use-case scope narrows significantly. Existing teens and young adults engage with Notes only for expressive behaviors, eventually concentrating almost exclusively on music-sharing use cases. This tightening of use-case scope explains declining usage patterns as teens age and highlights where product interventions are needed to maintain breadth of engagement over time.
Focus Group: Evaluating the Impact of ChatGPT’s Deep Research Tool at BCG
I designed and led focus groups with consulting and business services teams at BCG using the Nominal Group Technique (NGT), a structured group decision-making process that involves silent idea generation, round-robin sharing of ideas, clarification and discussion, and a final voting or ranking phase, so that teams could collectively evaluate the impact of ChatGPT’s Deep Research tool.
Data Analysis for GenAI Agent Tool & Materials Prioritization
In this project, I conducted a comprehensive data analysis across a large dataset of research records spanning multiple BCG practice areas. The goal was to determine which knowledge sources (e.g., Gartner) are referenced in the highest volume of requests and which provide the broadest value across practice areas. This analysis informed a strategic roadmap for prioritizing the development of agentic search-and-retrieval tools: features within these agents that can locate, unlock, access, and analyze prioritized information sources. The aim was to augment the manual research report request and creation workflow, delivering targeted benefits to both research report teams and large, cross-functional practice areas within the firm.
Discovery Research: Identifying Workflows to Augment with GenAI
In this discovery research project, we set out to identify which internal workflows at BCG should be augmented by GenAI based on time spent, effort, cost, and unmet needs. Through a combination of interviews and survey work, we discovered that level-one (L1) research requests (i.e., repetitive, lower-value research projects) are among the most common and time-consuming workflows. These frequently requested projects emerged as prime candidates for automation, offering a clear opportunity for GenAI intervention.
We focused on how to build agentic tooling that could auto-generate these L1 reports by leveraging prioritized knowledge sources and defining an appropriate human-in-the-loop experience. Ultimately, the findings helped us outline a roadmap for building GenAI tools tailored to streamline these specific, frequently performed workflows.
A/B Testing LLM Products & Features to Maximize Impact at BCG
In this project, we ran an unmoderated diary study with various BCG teams to compare and A/B test enterprise and API-based GenAI models on their routine tasks. We focused on evaluating which models delivered the most value and efficiency gains, providing insights into cost reduction and smarter GenAI budget allocation. We also tested early feature releases and compared agentic tools embedded in traditional products (like GenAI agents in Excel versus standard Excel) to see which features truly enhanced productivity. This approach helped us pinpoint the most impactful tools and guide leadership teams on efficient spend on GenAI tooling and team-specific tool allocation.
ChatGPT Survey 2.0: Optimized Screening & Survey Management for Continuous Listening
After the success of the survey 1.0 readouts, we introduced several key improvements to our survey program. By incorporating new behavioral and demographic data, we aimed to enhance our sampling. To increase response rates, we streamlined our survey design. We also re-focused on essential metrics and eliminated unnecessary data to better meet decision-makers' needs. Additionally, we incorporated new questions based on leadership feedback to gain nuanced insights into user experiences, focusing on task types, time-saving, attempted tasks, and both actual and perceived barriers.
ChatGPT Enterprise Adoption & ROI Survey
I led the quantitative research operations for a multimillion-dollar product purchasing decision for 20,000+ BCGers. This involved leading sampling, data quality, data cleaning, incentive management, survey design, survey programming, reporting, and partnering with other ROI research programs for evidence-gathering. We delivered presentations to high-level and executive leadership teams.
B.Y.O. Agent Interface Conceptual Research & Prototype Testing
I led the conceptual and evaluative research for an interface that enables BCGers to build their own conversational agents. This product enables users to connect to internal data sources and handle complex research tasks uniquely tailored to their needs. The goal of this research was to measure how intuitive the experience of building an agent felt for users. I worked with AI engineers and designers to create a feasible and intuitive interface design and led the design iteration that resulted from this prototype testing.
*deck is sanitized of names and images
ChatGPT Super User Research
To complement our quantitative (behavioral + subjective) ChatGPT adoption research and to support cross-team investigation into use cases, I led the design and execution of a mixed-methods study that leveraged usage frequency data and qualitative interviews to find what constituted a ChatGPT “super user.” We discovered that a small number of users in the top 10% of usage frequency at BCG were engaging in use cases that pushed the limits of ChatGPT and crossed the capabilities overhang that most users cannot cross. They were able to unlock these advanced capabilities because of their understanding of prompting best practices. Their prompts unlocked extreme time savings and quality improvements, which caused their use cases and prompts to go viral among their teams and practice areas.
R&D: Planning an Open-Source LLM Development Initiative at BCG
This strategic R&D initiative at BCG outlined a comprehensive plan to develop open-source LLMs using leading models like Llama 2 and Vicuna-13B, with the aim of outperforming API-based LLM agents for BCG-specific use cases. The project emphasized advanced techniques such as prompt engineering, Retrieval-Augmented Generation (RAG), fine-tuning, Direct Preference Optimization (DPO), and Reinforcement Learning from Human Feedback (RLHF) to ensure alignment of LLM capabilities with the nuanced demands of BCG's consulting environment.
Introducing MT-Bench LLM Evaluation Framework to Drive LLM Vendor Selection for BCG
This project evaluated LLM vendors based on their alignment with high-value use cases and prompt designs for BCG's specific consulting practice areas, using custom MT-Bench analysis questions. This effort ensured that user-centric prompts guide the LLM selection process. The initiative integrated LLM-based prompt evaluation and user satisfaction evaluations to ensure robust monitoring and evaluation of both well-vetted and emergent use cases. The evaluation results and the prompts are available to BCGers in a dashboard.
R&D: Planning BCG's Future Hybrid Recommender System & GenAI Agent Workspace
This project informed high-level senior leadership roadmapping through 2025 and million-dollar investment decisions. Our research proposed and began the transformation of BCG’s search-based knowledge discovery experience for client-facing staff into an advanced recommender system that also employs GenAI agents to automate research queries and tasks. Our vision was based on extensive market research on recommender system algorithms and user resonance testing. Our concepts and designs were delivered as a successful part of a visioning workshop to key stakeholders.
Prioritizing Consulting Research Tasks for Automation via Data & Textual Analysis of a Research Request Database
This project used various methods of exploratory and seeded keyword text analysis (TF-IDF, n-grams, spaCy, NLTK, etc.) as well as descriptive statistics and frequency analyses on a database containing 3 years (~400k rows) of research requests by consulting practice area. This analysis prioritized feature development for a new LLM agent. The goal was to automate high-frequency and high-duration research tasks and enhance user productivity by reducing the time spent executing and producing research reports across consulting practice areas.
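As a rough illustration of the kind of n-gram frequency and TF-IDF analysis mentioned above, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the sample requests, the function names, and the whitespace tokenizer are all assumptions made for illustration, and a real pipeline would more likely rely on spaCy or NLTK for tokenization.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Compute a basic TF-IDF score per term for each document.

    TF is the term count divided by document length; IDF is
    log(number of documents / number of documents containing the term).
    """
    tokenized = [doc.lower().split() for doc in docs]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    n_docs = len(docs)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical research requests, invented for this example.
requests = [
    "market size for electric vehicles",
    "market size for solar panels",
    "competitor profile for electric vehicles",
]

# Bigram frequencies highlight recurring request phrasing.
bigram_counts = Counter(
    bigram for request in requests for bigram in ngrams(request.lower().split(), 2)
)

# TF-IDF highlights the terms that distinguish one request from the others.
scores = tf_idf(requests)
```

In this toy data, recurring bigrams such as "market size" surface the common request types, while terms that appear in every request (like "for") score zero under TF-IDF.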
Correlation & Regression Analyses for Focused Feature Development for Navi: A Content-Recommender LLM Agent
This survey-based research enabled users to evaluate factors (i.e., LLM features) critical to the LLM experience. Using correlation and regression analyses, we connected LLM feature scores with user satisfaction, adoption likelihood, and recommendation likelihood scores. By analyzing how LLM features/attributes like speed, relevancy, and output structure impact user experience, we were able to make recommendations that drove targeted feature engineering improvements to enhance the post-MVP product prior to organization-wide launch to 20,000 users.
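To make the approach concrete, here is a minimal sketch of how feature ratings could be related to satisfaction scores with a Pearson correlation and a one-variable least-squares fit. Everything in it is an assumption for illustration: the ratings are invented, the function names are hypothetical, and the actual study may have used different models and tooling.

```python
import math


def _centered_sums(xs, ys):
    """Return (covariance sum, x variance sum, y variance sum, mean x, mean y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov, var_x, var_y, mean_x, mean_y


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    cov, var_x, var_y, _, _ = _centered_sums(xs, ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    cov, var_x, _, mean_x, mean_y = _centered_sums(xs, ys)
    if var_x == 0:
        raise ValueError("slope is undefined when x is constant")
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Hypothetical 1-5 survey ratings from five respondents (invented data).
satisfaction = [2, 3, 4, 5, 5]
feature_scores = {
    "speed": [4, 3, 4, 3, 4],
    "relevancy": [2, 3, 4, 5, 4],
    "output_structure": [3, 1, 4, 3, 5],
}

# Rank features by how strongly their ratings track satisfaction.
ranked = sorted(
    ((name, pearson(scores, satisfaction)) for name, scores in feature_scores.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
```

With this made-up data, relevancy ranks first and speed last. In a real analysis, the same ranking would be repeated for adoption likelihood and recommendation likelihood before drawing conclusions.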
Creating a Replicable UTAUT2-Based Survey Framework for User Insights Across LLM & GenAI Product Ecosystem
I implemented a survey framework that uses the UTAUT2 model's causal chain to gather user perceptions across products in an internal LLM and GenAI ecosystem (5+ products). The survey framework enables multiple types of data analysis, including analysis of the factors affecting adoption and satisfaction. The collected insights are used for strategic product development and user enablement initiatives and enable product leadership to monitor important metrics across GenAI products that serve 40,000+ users.
Maximizing the Value of Explore Product Microenvironment Reports in PathAI’s TxR Platform for BioPharma Leadership
This project addresses challenges in the Translational Research (TxR) platform's interactive reports and proposes an enhanced reporting solution. Issues include variations in quality, operational inefficiencies, and a lack of interactive visualizations for quality control (QC) data exploration.
Embracing Benchmarking: Advanced Analyses for Usage Insights & Data-Driven Decision-Making
This platform analytics and data analysis project focuses on centralizing diverse benchmarking metrics to gain comprehensive insights into user segments, user behaviors, and feature usage to prioritize & optimize platform user experiences and drive innovation for AiSight | Clinical Trials & AiSight.
Productizing (AKA Monetizing) PathAI’s TxR Contributor Network
This was an interesting project with a unique product: the contributor pathologist network. This network is seen as a strategic differentiator for our company. The recommendations were equally interesting, as we found a compelling TxR opportunity to solve the high-impact problem of increasing TxR annotation quality and diversity while also uncovering an unanticipated revenue opportunity. This project was complicated because it began with a variety of internal assumptions and hypotheses, all of which were invalidated during the study.
Optimizing PathAI’s Clinical Trial Platform for Scale & Discovering a Product-Market-Fit
This was an incredibly successful project that aimed to optimize the CTS platform and improve internal clinical trial operational efficiency from whole slide image (WSI) ingestion to participant report release. Through use of the lead user method and the jobs-to-be-done framework, we found a common thread and a product-market-fit opportunity that serves a significantly underserved user segment tasked with clinical trial data monitoring and clinical trial management.
Predicting and Improving the Probability of AiSight RUO Adoption in Clinical Diagnostic Settings
As PathAI began commercializing AiSight RUO in diagnostic settings, our organization needed a “complete picture” of what it would take to ensure adoption and improve adoption likelihood. I led an industry-leading mixed-methods project that used proven technology adoption & usage models and a survey to study the complex factors that contribute to successful implementation and usage of our products.
Advancing Digital Pathology System (DPS) & Laboratory Information System (LIS) Integrations at PathAI
A lack of interoperability is one of the greatest challenges facing healthcare informatics today, and is one of the primary barriers to digital pathology system adoption in diagnostic and research settings. Structured Data Capture (SDC) is an open-source technical framework that enables the capture and exchange of standardized and structured data in interoperable data entry forms (DEFs) at the point of care.
Design Research Operations
Organization-wide education about the purpose, strategy, and goals of the design research team to maximize our efficacy
As a design research leader, I have helped my organization bring disparate, multi-disciplinary research endeavors across various teams under a centralized research team, improving them methodologically and providing strategic insights to organizational leadership.
Setting Design Research Standards
Creating Organizational Understanding About Design Research Processes & Deliverables
As a design research leader, it is also my responsibility to ensure that my team delivers best-in-class approaches and deliverables for every design research project, and to communicate to non-researchers what design research is and what our project stakeholders can expect from our team.
Design Research Mentorship
I’m also a design research mentor outside of PathAI through ADPList. I have mentored over 25 design researchers, and have received very positive reviews, such as, “Riley is super knowledgeable about the design research space, and is an encouraging mentor. In my two sessions with him, he's given me specific, actionable, and compelling advice - just as a design researcher would in a team. Coming from similar worlds in international development, he's helped me feel excited about this field while giving me perspective on what employers are really looking for.”
An overview of past companies and projects
Amazon Robotics—Boston, MA: As a Design Researcher at Amazon Robotics, I planned and analyzed qualitative benchmark / diary studies that used GoPro cameras to record an hour in the life of fulfillment center users before and after feature releases and for discovery efforts. Later, I was the research and design lead of a highly prioritized internal operational analytics dashboard project (screen blurred for NDA). We pilot tested and released an MVP that increased dashboard usage metrics across fulfillment centers in the USA and Canada, as confirmed by a benchmarking study. This was a long-term embedded research project with a product manager and engineering team working through the discovery and development process. I led the design through iterative design team and product team critique and QA. During development, I sat in on daily sprints, did handoffs to front-end engineers, did design QA with engineering, prioritized features with product, wrote user stories in JIRA, etc. Lots of fun projects and white paper writing at Amazon. Lots of MVP prototype testing, collaborating with human factors, research ops, jobs-to-be-done framework, future press releases, and heuristic analyses.
LEC & HUNTER—Lima, Peru: As a design researcher & strategist at LEC & HUNTER, I conducted a variety of internal design thinking workshops & workshops with customers, facilitated interview training, worked with the Ministry of Public Health (MINSA), and did creative strategy & re-design work. The MINSA work was to align public health insights with conceptual and evaluative research to inform the design and development of technology products that aim to modify behaviors and reduce public health crises. These products’ user experiences are built based on Education Entertainment (EE) and Social and Behavior Change Communication (SBCC) workflows and frameworks. Workshop tools are examples and no real results are shown.
WAVELENGTHS LLC—Boston, MA: LLC I started. Designed and developed 2 smartphone applications that function as ethnographic research and photo-journaling guides. They were intended for creative strategy and design research teams who embrace ethnographic research in physical locations (e.g., academic campuses, urban development contexts, etc.). I also consulted with design research agencies to improve their own proprietary research applications.
MassArt Design Research Course Instructor—Boston, MA: During my MFA at MassArt, I co-taught the Introductory Design Research course. It was fun. We did a bunch of activities, but a particularly fun one was “the hunch workshop,” which leads students through the research, brainstorming, & iterative design process with the goal of designing the “ideal desk” for one of three student typologies. I also wrote a thesis book on the intersection of social research and human-centered design research, which hypothesized: if we create design research apps that (a) let design research teams choose from a toolkit of social and design research methods, and (b) guide these users through the data collection and sharing process, then we can (1) increase the frequency and ease of use of new methods, (2) increase the frequency and ease of documentation and results sharing, and (3) improve the overall efficiency of design research teams as measured by their arrival at insights.
Social Research in Public Health—Latin America: In a past prior to my MFA, I worked as a Social (sometimes field-based) Researcher in public health settings for a variety of NGOs and in the private sector. My teams were focused on facilitating workshops for project innovation (think ideo.org) and participatory monitoring & evaluation (M&E). We often utilized participatory research methods and design thinking methods to ensure inclusive brainstorming, planning, and innovation. We set up long-term longitudinal and monitoring & evaluation studies (i.e., quantitative and qualitative benchmarking studies), and we trained local facilitators to use these methods and develop these programs themselves. We also planned communication & media interventions to promote the theory of change and the ideas for innovation. We worked with production teams to design content and strategies. We frequently planned and implemented a variety of participatory, ethnographic, and social and behavior change communication (SBCC) toolkits and methodologies and engaged in common social research approaches (e.g., interviews, focus groups, surveys, etc.). I worked with many governments and their research partners for this work (including indigenous tribunal councils). I have an MA in Social Research for Public Health from Ohio University.
Travel Researcher & Journalist—Central America & Mexico: In an even more distant past, I was a travel writer in Mexico. I like to say this is where my professional research career began. The process started like a lot of projects do today: thinking of interesting interventions (stories), ranking interest, contacting stakeholders to interview, recording, transcribing, and synthesizing the interviews, and telling detailed stories with “thick descriptions” of people and their daily lives, painting a picture of daily life in unique cultural contexts!