diff --git a/Demos/README.md b/Demos/README.md index 55c241e..26cd9b1 100644 --- a/Demos/README.md +++ b/Demos/README.md @@ -25,6 +25,13 @@ This directory contains cookbooks demonstrating various fine-tuning techniques o **Products/SDKs**: Microsoft Foundry, Azure AI Projects SDK **What it shows**: Upload datasets, create SFT job, monitor training, deploy model, and test medical summarization +### [RFT_Math_Reasoning](RFT_Math_Reasoning/) +**Technique**: Reinforcement Fine-Tuning (RFT) +**Use Case**: Advanced mathematical reasoning and problem-solving +**Dataset**: OpenR1-Math-220k +**Products/SDKs**: Microsoft Foundry, Azure AI Projects SDK +**What it shows**: Upload RFT datasets, create grading function for mathematical reasoning, configure and launch RFT job, monitor training progress, deploy model, and test advanced mathematical problem-solving capabilities + --- *Note: Each demo includes a complete notebook, dataset, requirements, and detailed README.* diff --git a/Demos/RFT_Math_Reasoning/.env.template b/Demos/RFT_Math_Reasoning/.env.template new file mode 100644 index 0000000..0812886 --- /dev/null +++ b/Demos/RFT_Math_Reasoning/.env.template @@ -0,0 +1,5 @@ +MICROSOFT_FOUNDRY_PROJECT_ENDPOINT= +AZURE_SUBSCRIPTION_ID= +AZURE_RESOURCE_GROUP= +AZURE_AOAI_ACCOUNT= +MODEL_NAME= \ No newline at end of file diff --git a/Demos/RFT_Math_Reasoning/README.md b/Demos/RFT_Math_Reasoning/README.md new file mode 100644 index 0000000..3efc327 --- /dev/null +++ b/Demos/RFT_Math_Reasoning/README.md @@ -0,0 +1,192 @@ +# Reinforcement Fine-Tuning with OpenR1-Math-220k Dataset + +This cookbook demonstrates how to fine-tune language models using **Reinforcement Fine-Tuning (RFT)** with the OpenR1-Math-220k dataset on Microsoft Foundry. This dataset contains 220,000 advanced mathematical reasoning problems with verified step-by-step solutions, making it ideal for teaching models complex mathematical problem-solving. 
+ +## Overview + +Reinforcement Fine-Tuning (RFT) is a powerful technique for training language models on tasks where: +- Multiple valid solution paths exist +- Correctness can be verified automatically +- Step-by-step reasoning is crucial +- The task requires structured, multi-hop thinking + +This cookbook uses the **OpenR1-Math-220k dataset**, which contains advanced mathematics problems from college-level and competition mathematics. For RFT, we use only the problem statements and final answers extracted from this dataset. The problems cover diverse mathematical domains including algebra, calculus, geometry, and number theory. + +## Dataset Information + +**Source**: [OpenR1-Math-220k on Kaggle](https://www.kaggle.com/datasets/alejopaullier/openr1-math-220k) | [Hugging Face](https://huggingface.co/datasets/open-r1/OpenR1-Math-220k) + +**Dataset Statistics**: +- **Full dataset**: 220,000 mathematical reasoning problems (7 parquet files, 1.45 GB total) +- **Source parquet**: train-00000-of-00007.parquet (13,391 problems with 27,614 verified reasoning traces) +- **Training set**: 2,661 examples (one entry per problem) +- **Validation set**: 565 examples (one entry per problem) + +> **Important**: +> - The RFT dataset contains processed data extracted from verified parquet files +> - Each training example contains **prompts only** (system + user) and a ground truth answer + + +**What the Data Contains**: +The RFT dataset consists of advanced mathematical problems with ground truth answers. Each training example includes: +- **System prompt**: Instructions for mathematical problem-solving behavior +- **Problem statement**: Complex mathematics question requiring multi-step reasoning +- **Ground truth answer**: Final answer extracted from verified solutions, used by the grader for scoring + +**Note**: The RFT format contains only prompts (system + user) and answers. 
The model learns to generate step-by-step reasoning through reinforcement learning, not by mimicking provided reasoning traces. + +**Problem Domains Covered**: +- Algebra and polynomial factorization +- Calculus and optimization +- Probability theory and expected values +- Geometry and trigonometry +- Number theory and combinatorics +- Linear algebra and matrix operations +- Complex numbers and abstract mathematics + +**Task Complexity**: +- Advanced multi-step mathematical reasoning +- Problems require abstract algebraic manipulation +- Multi-hop logical reasoning (typically 5-15 reasoning steps) +- Advanced mathematical concepts beyond basic arithmetic +- The model must generate detailed reasoning chains (averaging 8k tokens, complex problems up to 16k tokens) + +**Why RFT is Perfect for This Task**: +- **Multiple valid solution paths**: Many ways to solve each problem +- **Verifiable correctness**: Mathematical answers can be automatically checked +- **Exploration-based learning**: Model discovers effective reasoning strategies through reinforcement +- **Quality over memorization**: Model learns reasoning patterns, not just answers + +## What You'll Learn + +This cookbook teaches you how to: + +1. Understand the RFT dataset format (prompts + ground truth answers) +2. Set up your Microsoft Foundry environment for RFT +3. Create a grading function to evaluate mathematical reasoning quality +4. Configure and launch an RFT fine-tuning job +5. Monitor training progress and model performance +6. Deploy and test your fine-tuned mathematical reasoning model + +**Note**: The RFT-formatted dataset files (training_rft.jsonl and validation_rft.jsonl) are already provided. If you want to prepare your own dataset from the original Kaggle files, you'll need to extract problems and answers into the RFT format shown in the Dataset Format section. 
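If you do prepare your own data, the conversion described in the note above can be sketched roughly as follows. This is a minimal illustration, not the script used to build the provided files: the helper names `to_rft_record` and `write_rft_jsonl` are hypothetical, and it assumes you have already extracted each problem statement and final answer as plain strings. The system instructions are copied from the example in the Dataset Format section.

```python
import json

# System instructions copied from the example record in the Dataset Format section.
SYSTEM_PROMPT = (
    "You are a mathematical reasoning expert. Solve problems with detailed "
    "step-by-step thinking and provide final answers in \\boxed{} format. "
    "Show all intermediate calculations and explain your reasoning clearly."
)


def to_rft_record(problem: str, answer: str) -> dict:
    """Build one RFT example: a single user message plus the ground truth answer.

    Hypothetical helper: the system instructions are prepended to the problem
    inside the user message, and no assistant message is included.
    """
    return {
        "messages": [
            {"role": "user", "content": f"{SYSTEM_PROMPT}\n\n{problem.strip()}"}
        ],
        "answer": answer.strip(),
    }


def write_rft_jsonl(rows, path: str) -> int:
    """Write (problem, answer) pairs as JSONL, skipping rows with a missing field.

    Returns the number of records written.
    """
    written = 0
    with open(path, "w", encoding="utf-8") as f:
        for problem, answer in rows:
            if not problem or not answer:
                continue
            record = to_rft_record(problem, answer)
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            written += 1
    return written
```

Calling `write_rft_jsonl([("What is 6 * 7?", "42")], "my_training_rft.jsonl")` would write one line in the same shape as the provided files. `json.dumps` escapes the backslash, so the prompt appears as `\\boxed{}` on disk, matching the example record.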
+ +## Prerequisites + +- Azure subscription with Microsoft Foundry project (requires **Azure AI User** role) +- Python 3.9 or higher +- Familiarity with Jupyter notebooks +- Kaggle account (only needed if you want to download the original OpenR1-Math-220k dataset and prepare your own files) +- Understanding of basic mathematical concepts + +## Supported Models + +RFT in Microsoft Foundry supports the following models: + +- **o4-mini** +- **gpt-5 (PrPr)** + + +> **Note**: Model availability may vary by region. Check the [Azure OpenAI model availability](https://learn.microsoft.com/azure/ai-services/openai/concepts/models) page for current regional support. + +## Files in This Cookbook + +- **README.md**: This file - comprehensive documentation +- **requirements.txt**: Python dependencies required for the cookbook +- **rft_math_reasoning.ipynb**: Step-by-step notebook implementation +- **training_rft.jsonl**: Training dataset (2,661 examples in RFT format) +- **validation_rft.jsonl**: Validation dataset (565 examples in RFT format) + +## Quick Start + + +### 1. Install Dependencies + +```powershell +pip install -r requirements.txt +``` + +### 2. Prepare the Dataset + +You can use the training and validation datasets in this directory as-is, or you can prepare your own. + +### 3. Set Up Environment Variables + +Copy the file `.env.template` (located in this folder) and save it as a file named `.env`. Enter appropriate values for the environment variables used by the job you want to run. + +```env +# Required for RFT Fine-Tuning +MICROSOFT_FOUNDRY_PROJECT_ENDPOINT= +AZURE_SUBSCRIPTION_ID= +AZURE_RESOURCE_GROUP= +AZURE_AOAI_ACCOUNT= +MODEL_NAME= +``` + +### 4. Run the Notebook + +Open `rft_math_reasoning.ipynb` and follow the step-by-step instructions. + +## Dataset Format + +The RFT format for mathematical reasoning follows this structure: + +```json +{ + "messages": [ + { + "role": "user", + "content": "You are a mathematical reasoning expert. 
Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nCall a scalene triangle K disguisable if there exists a triangle K′ similar to K with two shorter sides precisely as long as the two longer sides of K, respectively. Call a disguisable triangle integral if the lengths of all its sides are integers. (a) Find the side lengths of the integral disguisable triangle with the smallest possible perimeter. (b) Let K be an arbitrary integral disguisable triangle for which no smaller integral disguisable triangle similar to it exists. Prove that at least two side lengths of K are perfect squares." + } + ], + "answer": "9" +} +``` + +Each training example contains: +- **messages**: Array with exactly 1 message: + - **user message**: Combined system prompt and mathematical problem statement. The system instructions are included at the beginning of the user content, followed by the actual problem. +- **answer**: Ground truth final answer (used by the grader for verification) + +**Important**: Unlike Supervised Fine-Tuning (SFT), RFT format does NOT include an assistant message. The model learns to generate solutions through reinforcement signals from the grader, which compares the model's output against the ground truth answer field. + +## Training Configuration + +Recommended hyperparameters for RFT with mathematical reasoning: + +- **Model**: o4-mini +- **Epochs**: 2 (prevents overfitting while allowing reinforcement learning) +- **Batch Size**: 1 (limited by memory constraints with long sequence lengths) +- **Learning Rate Multiplier**: 1.0 (standard learning rate for RFT) + +These can be adjusted based on your computational resources and specific requirements. + +## Grading Mathematical Solutions + +The grading function for RFT is a deterministic Python function that evaluates: + +1. 
**Answer Extraction**: Extracts the final answer from the model's `\boxed{}` notation +2. **Answer Normalization**: Handles different formats (comma-separated, space-separated, LaTeX expressions) +3. **Answer Correctness**: Compares normalized model answer against ground truth + +The grader returns: +- **1.0**: Model answer matches ground truth (after normalization) +- **0.0**: Model answer does not match or is missing/malformed + +This binary reward signal guides the model to learn effective reasoning strategies that produce correct final answers. + +## Expected Outcomes + +After fine-tuning with RFT on OpenR1-Math-220k, your model should: + +- Generate detailed step-by-step mathematical reasoning chains with clear intermediate steps +- Solve college-level and competition mathematics problems across diverse domains +- Produce properly formatted answers with `\boxed{}` notation for grader verification +- Demonstrate significantly improved performance on multi-hop mathematical reasoning compared to the base model + + +## Additional Resources + +- [Azure OpenAI Fine-Tuning Documentation](https://learn.microsoft.com/azure/ai-services/openai/how-to/fine-tuning) +- [OpenR1-Math-220k dataset](https://www.kaggle.com/datasets/alejopaullier/openr1-math-220k) + +--- diff --git a/Demos/RFT_Math_Reasoning/requirements.txt b/Demos/RFT_Math_Reasoning/requirements.txt new file mode 100644 index 0000000..bd9c840 --- /dev/null +++ b/Demos/RFT_Math_Reasoning/requirements.txt @@ -0,0 +1,13 @@ +# Azure AI SDK for fine-tuning +azure-ai-projects>=2.0.0b1 + +# OpenAI SDK +openai + +# Azure Cognitive Services for model deployment +azure-identity +azure-mgmt-cognitiveservices + +# Core libraries +python-dotenv + diff --git a/Demos/RFT_Math_Reasoning/rft_math_reasoning.ipynb b/Demos/RFT_Math_Reasoning/rft_math_reasoning.ipynb new file mode 100644 index 0000000..e79a506 --- /dev/null +++ b/Demos/RFT_Math_Reasoning/rft_math_reasoning.ipynb @@ -0,0 +1,776 @@ +{ + "cells": [ + { + "cell_type": 
"markdown", + "metadata": {}, + "source": [ + "# Reinforcement Fine-Tuning (RFT) with OpenR1-Math-220k Dataset\n", + "\n", + "This notebook demonstrates how to fine-tune language models using **Reinforcement Fine-Tuning (RFT)** with the OpenR1-Math-220k dataset - a collection of 220,000 advanced mathematical reasoning problems with verified step-by-step solutions.\n", + "\n", + "## What You'll Learn\n", + "1. Understand reinforcement fine-tuning (RFT) and how it differs from supervised fine-tuning (SFT)\n", + "2. Define a grader/reward function for mathematical reasoning\n", + "3. Prepare and upload mathematical reasoning datasets\n", + "4. Create and configure an RFT job using the OpenAI method\n", + "5. Monitor training progress and evaluate model performance\n", + "6. Deploy and test your RFT fine-tuned model\n", + "\n", + "**Note**: Execute each cell in sequence." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. Setup and Installation\n", + "\n", + "Install all required packages from requirements.txt" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Collecting azure-ai-projects>=2.0.0b1 (from -r requirements.txt (line 2))\n", + " Using cached azure_ai_projects-2.0.0b3-py3-none-any.whl.metadata (68 kB)\n", + "Collecting openai (from -r requirements.txt (line 5))\n", + " Downloading openai-2.15.0-py3-none-any.whl.metadata (29 kB)\n", + "Collecting azure-identity (from -r requirements.txt (line 8))\n", + " Using cached azure_identity-1.25.1-py3-none-any.whl.metadata (88 kB)\n", + "Collecting azure-mgmt-cognitiveservices (from -r requirements.txt (line 9))\n", + " Using cached azure_mgmt_cognitiveservices-14.1.0-py3-none-any.whl.metadata (32 kB)\n", + "Collecting python-dotenv (from -r requirements.txt (line 12))\n", + " Using cached python_dotenv-1.2.1-py3-none-any.whl.metadata (25 kB)\n", + "Collecting isodate>=0.6.1 (from 
azure-ai-projects>=2.0.0b1->-r requirements.txt (line 2))\n", + " Using cached isodate-0.7.2-py3-none-any.whl.metadata (11 kB)\n", + "Collecting azure-core>=1.35.0 (from azure-ai-projects>=2.0.0b1->-r requirements.txt (line 2))\n", + " Downloading azure_core-1.38.0-py3-none-any.whl.metadata (47 kB)\n", + " ---------------------------------------- 0.0/47.7 kB ? eta -:--:--\n", + " ---------------------------------------- 47.7/47.7 kB 1.2 MB/s eta 0:00:00\n", + "Collecting azure-storage-blob>=12.15.0 (from azure-ai-projects>=2.0.0b1->-r requirements.txt (line 2))\n", + " Using cached azure_storage_blob-12.28.0-py3-none-any.whl.metadata (26 kB)\n", + "Collecting anyio<5,>=3.5.0 (from openai->-r requirements.txt (line 5))\n", + " Using cached anyio-4.12.1-py3-none-any.whl.metadata (4.3 kB)\n", + "Collecting distro<2,>=1.7.0 (from openai->-r requirements.txt (line 5))\n", + " Using cached distro-1.9.0-py3-none-any.whl.metadata (6.8 kB)\n", + "Collecting httpx<1,>=0.23.0 (from openai->-r requirements.txt (line 5))\n", + " Using cached httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB)\n", + "Collecting jiter<1,>=0.10.0 (from openai->-r requirements.txt (line 5))\n", + " Using cached jiter-0.12.0-cp311-cp311-win_amd64.whl.metadata (5.3 kB)\n", + "Collecting pydantic<3,>=1.9.0 (from openai->-r requirements.txt (line 5))\n", + " Using cached pydantic-2.12.5-py3-none-any.whl.metadata (90 kB)\n", + "Collecting sniffio (from openai->-r requirements.txt (line 5))\n", + " Using cached sniffio-1.3.1-py3-none-any.whl.metadata (3.9 kB)\n", + "Collecting tqdm>4 (from openai->-r requirements.txt (line 5))\n", + " Using cached tqdm-4.67.1-py3-none-any.whl.metadata (57 kB)\n", + "Requirement already satisfied: typing-extensions<5,>=4.11 in c:\\work\\amlrepos\\rftcookbook\\fine-tuning\\env\\lib\\site-packages (from openai->-r requirements.txt (line 5)) (4.15.0)\n", + "Collecting cryptography>=2.5 (from azure-identity->-r requirements.txt (line 8))\n", + " Using cached 
cryptography-46.0.3-cp311-abi3-win_amd64.whl.metadata (5.7 kB)\n", + "Collecting msal>=1.30.0 (from azure-identity->-r requirements.txt (line 8))\n", + " Using cached msal-1.34.0-py3-none-any.whl.metadata (11 kB)\n", + "Collecting msal-extensions>=1.2.0 (from azure-identity->-r requirements.txt (line 8))\n", + " Using cached msal_extensions-1.3.1-py3-none-any.whl.metadata (7.8 kB)\n", + "Collecting msrest>=0.7.1 (from azure-mgmt-cognitiveservices->-r requirements.txt (line 9))\n", + " Using cached msrest-0.7.1-py3-none-any.whl.metadata (21 kB)\n", + "Collecting azure-mgmt-core>=1.6.0 (from azure-mgmt-cognitiveservices->-r requirements.txt (line 9))\n", + " Using cached azure_mgmt_core-1.6.0-py3-none-any.whl.metadata (4.6 kB)\n", + "Collecting idna>=2.8 (from anyio<5,>=3.5.0->openai->-r requirements.txt (line 5))\n", + " Using cached idna-3.11-py3-none-any.whl.metadata (8.4 kB)\n", + "Collecting requests>=2.21.0 (from azure-core>=1.35.0->azure-ai-projects>=2.0.0b1->-r requirements.txt (line 2))\n", + " Using cached requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)\n", + "Collecting cffi>=2.0.0 (from cryptography>=2.5->azure-identity->-r requirements.txt (line 8))\n", + " Using cached cffi-2.0.0-cp311-cp311-win_amd64.whl.metadata (2.6 kB)\n", + "Collecting certifi (from httpx<1,>=0.23.0->openai->-r requirements.txt (line 5))\n", + " Using cached certifi-2026.1.4-py3-none-any.whl.metadata (2.5 kB)\n", + "Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai->-r requirements.txt (line 5))\n", + " Using cached httpcore-1.0.9-py3-none-any.whl.metadata (21 kB)\n", + "Collecting h11>=0.16 (from httpcore==1.*->httpx<1,>=0.23.0->openai->-r requirements.txt (line 5))\n", + " Using cached h11-0.16.0-py3-none-any.whl.metadata (8.3 kB)\n", + "Collecting PyJWT<3,>=1.0.0 (from PyJWT[crypto]<3,>=1.0.0->msal>=1.30.0->azure-identity->-r requirements.txt (line 8))\n", + " Using cached PyJWT-2.10.1-py3-none-any.whl.metadata (4.0 kB)\n", + "Collecting requests-oauthlib>=0.5.0 (from 
msrest>=0.7.1->azure-mgmt-cognitiveservices->-r requirements.txt (line 9))\n", + " Using cached requests_oauthlib-2.0.0-py2.py3-none-any.whl.metadata (11 kB)\n", + "Collecting annotated-types>=0.6.0 (from pydantic<3,>=1.9.0->openai->-r requirements.txt (line 5))\n", + " Using cached annotated_types-0.7.0-py3-none-any.whl.metadata (15 kB)\n", + "Collecting pydantic-core==2.41.5 (from pydantic<3,>=1.9.0->openai->-r requirements.txt (line 5))\n", + " Using cached pydantic_core-2.41.5-cp311-cp311-win_amd64.whl.metadata (7.4 kB)\n", + "Collecting typing-inspection>=0.4.2 (from pydantic<3,>=1.9.0->openai->-r requirements.txt (line 5))\n", + " Using cached typing_inspection-0.4.2-py3-none-any.whl.metadata (2.6 kB)\n", + "Requirement already satisfied: colorama in c:\\work\\amlrepos\\rftcookbook\\fine-tuning\\env\\lib\\site-packages (from tqdm>4->openai->-r requirements.txt (line 5)) (0.4.6)\n", + "Collecting pycparser (from cffi>=2.0.0->cryptography>=2.5->azure-identity->-r requirements.txt (line 8))\n", + " Using cached pycparser-2.23-py3-none-any.whl.metadata (993 bytes)\n", + "Collecting charset_normalizer<4,>=2 (from requests>=2.21.0->azure-core>=1.35.0->azure-ai-projects>=2.0.0b1->-r requirements.txt (line 2))\n", + " Using cached charset_normalizer-3.4.4-cp311-cp311-win_amd64.whl.metadata (38 kB)\n", + "Collecting urllib3<3,>=1.21.1 (from requests>=2.21.0->azure-core>=1.35.0->azure-ai-projects>=2.0.0b1->-r requirements.txt (line 2))\n", + " Using cached urllib3-2.6.3-py3-none-any.whl.metadata (6.9 kB)\n", + "Collecting oauthlib>=3.0.0 (from requests-oauthlib>=0.5.0->msrest>=0.7.1->azure-mgmt-cognitiveservices->-r requirements.txt (line 9))\n", + " Using cached oauthlib-3.3.1-py3-none-any.whl.metadata (7.9 kB)\n", + "Using cached azure_ai_projects-2.0.0b3-py3-none-any.whl (240 kB)\n", + "Downloading openai-2.15.0-py3-none-any.whl (1.1 MB)\n", + " ---------------------------------------- 0.0/1.1 MB ? 
eta -:--:--\n", + " --- ------------------------------------ 0.1/1.1 MB 5.1 MB/s eta 0:00:01\n", + " --------- ------------------------------ 0.3/1.1 MB 3.2 MB/s eta 0:00:01\n", + " --------------------- ------------------ 0.6/1.1 MB 4.4 MB/s eta 0:00:01\n", + " ------------------------------------- -- 1.0/1.1 MB 5.8 MB/s eta 0:00:01\n", + " ---------------------------------------- 1.1/1.1 MB 5.2 MB/s eta 0:00:00\n", + "Using cached azure_identity-1.25.1-py3-none-any.whl (191 kB)\n", + "Using cached azure_mgmt_cognitiveservices-14.1.0-py3-none-any.whl (290 kB)\n", + "Using cached python_dotenv-1.2.1-py3-none-any.whl (21 kB)\n", + "Using cached anyio-4.12.1-py3-none-any.whl (113 kB)\n", + "Downloading azure_core-1.38.0-py3-none-any.whl (217 kB)\n", + " ---------------------------------------- 0.0/217.8 kB ? eta -:--:--\n", + " --------------------------------------- 217.8/217.8 kB 13.8 MB/s eta 0:00:00\n", + "Using cached azure_mgmt_core-1.6.0-py3-none-any.whl (29 kB)\n", + "Using cached azure_storage_blob-12.28.0-py3-none-any.whl (431 kB)\n", + "Using cached cryptography-46.0.3-cp311-abi3-win_amd64.whl (3.5 MB)\n", + "Using cached distro-1.9.0-py3-none-any.whl (20 kB)\n", + "Using cached httpx-0.28.1-py3-none-any.whl (73 kB)\n", + "Using cached httpcore-1.0.9-py3-none-any.whl (78 kB)\n", + "Using cached isodate-0.7.2-py3-none-any.whl (22 kB)\n", + "Using cached jiter-0.12.0-cp311-cp311-win_amd64.whl (204 kB)\n", + "Using cached msal-1.34.0-py3-none-any.whl (116 kB)\n", + "Using cached msal_extensions-1.3.1-py3-none-any.whl (20 kB)\n", + "Using cached msrest-0.7.1-py3-none-any.whl (85 kB)\n", + "Using cached pydantic-2.12.5-py3-none-any.whl (463 kB)\n", + "Using cached pydantic_core-2.41.5-cp311-cp311-win_amd64.whl (2.0 MB)\n", + "Using cached tqdm-4.67.1-py3-none-any.whl (78 kB)\n", + "Using cached sniffio-1.3.1-py3-none-any.whl (10 kB)\n", + "Using cached annotated_types-0.7.0-py3-none-any.whl (13 kB)\n", + "Using cached certifi-2026.1.4-py3-none-any.whl (152 
kB)\n", + "Using cached cffi-2.0.0-cp311-cp311-win_amd64.whl (182 kB)\n", + "Using cached idna-3.11-py3-none-any.whl (71 kB)\n", + "Using cached PyJWT-2.10.1-py3-none-any.whl (22 kB)\n", + "Using cached requests-2.32.5-py3-none-any.whl (64 kB)\n", + "Using cached requests_oauthlib-2.0.0-py2.py3-none-any.whl (24 kB)\n", + "Using cached typing_inspection-0.4.2-py3-none-any.whl (14 kB)\n", + "Using cached charset_normalizer-3.4.4-cp311-cp311-win_amd64.whl (106 kB)\n", + "Using cached h11-0.16.0-py3-none-any.whl (37 kB)\n", + "Using cached oauthlib-3.3.1-py3-none-any.whl (160 kB)\n", + "Using cached urllib3-2.6.3-py3-none-any.whl (131 kB)\n", + "Using cached pycparser-2.23-py3-none-any.whl (118 kB)\n", + "Installing collected packages: urllib3, typing-inspection, tqdm, sniffio, python-dotenv, PyJWT, pydantic-core, pycparser, oauthlib, jiter, isodate, idna, h11, distro, charset_normalizer, certifi, annotated-types, requests, pydantic, httpcore, cffi, anyio, requests-oauthlib, httpx, cryptography, azure-core, openai, msrest, azure-storage-blob, azure-mgmt-core, msal, azure-mgmt-cognitiveservices, msal-extensions, azure-identity, azure-ai-projects\n", + "Successfully installed PyJWT-2.10.1 annotated-types-0.7.0 anyio-4.12.1 azure-ai-projects-2.0.0b3 azure-core-1.38.0 azure-identity-1.25.1 azure-mgmt-cognitiveservices-14.1.0 azure-mgmt-core-1.6.0 azure-storage-blob-12.28.0 certifi-2026.1.4 cffi-2.0.0 charset_normalizer-3.4.4 cryptography-46.0.3 distro-1.9.0 h11-0.16.0 httpcore-1.0.9 httpx-0.28.1 idna-3.11 isodate-0.7.2 jiter-0.12.0 msal-1.34.0 msal-extensions-1.3.1 msrest-0.7.1 oauthlib-3.3.1 openai-2.15.0 pycparser-2.23 pydantic-2.12.5 pydantic-core-2.41.5 python-dotenv-1.2.1 requests-2.32.5 requests-oauthlib-2.0.0 sniffio-1.3.1 tqdm-4.67.1 typing-inspection-0.4.2 urllib3-2.6.3\n", + "Note: you may need to restart the kernel to use updated packages.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n", + "[notice] A new release of pip is 
available: 24.0 -> 25.3\n", + "[notice] To update, run: python.exe -m pip install --upgrade pip\n" + ] + } + ], + "source": [ + "%pip install -r requirements.txt" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2. Import Libraries" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "All libraries imported successfully\n" + ] + } + ], + "source": [ + "import os\n", + "from dotenv import load_dotenv\n", + "from azure.ai.projects import AIProjectClient\n", + "from azure.identity import DefaultAzureCredential\n", + "\n", + "print(\"All libraries imported successfully\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 3. Configure Azure Environment\n", + "\n", + "Set your Microsoft Foundry Project endpoint, model name, and other environment variables. We're using **o4-mini** in this example for RFT. Copy the file `.env.template` (located in this folder) and save it as a file named `.env`. Enter appropriate values for the environment variables used by the job you want to run.\n", + "\n", + "```\n", + "MICROSOFT_FOUNDRY_PROJECT_ENDPOINT=\n", + "AZURE_SUBSCRIPTION_ID=\n", + "AZURE_RESOURCE_GROUP=\n", + "AZURE_AOAI_ACCOUNT=\n", + "MODEL_NAME=o4-mini\n", + "```" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "load_dotenv()\n", + "\n", + "endpoint = os.environ.get(\"MICROSOFT_FOUNDRY_PROJECT_ENDPOINT\")\n", + "model_name = os.environ.get(\"MODEL_NAME\", \"o4-mini\")\n", + "\n", + "# Define RFT dataset file paths\n", + "training_file_path = \"training_rft.jsonl\"\n", + "validation_file_path = \"validation_rft.jsonl\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 4. Connect to Microsoft Foundry Project\n", + "\n", + "Connect to Microsoft Foundry Project using Azure credential authentication. 
This initializes the project client and OpenAI client needed for fine-tuning workflows.\n", + "\n", + "**Important**: Ensure you have the **Azure AI User** role assigned to your account for the Microsoft Foundry Project resource." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Connected to Microsoft Foundry Project\n" + ] + } + ], + "source": [ + "credential = DefaultAzureCredential()\n", + "project_client = AIProjectClient(endpoint=endpoint, credential=credential)\n", + "openai_client = project_client.get_openai_client()\n", + "\n", + "print(\"Connected to Microsoft Foundry Project\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5. Define Mathematical Grader for RFT\n", + "\n", + "Reinforcement Fine-Tuning (RFT) requires a grader function to evaluate model outputs. Unlike SFT which learns from examples, RFT learns from reward signals.\n", + "\n", + "For mathematical reasoning, we use a **Python-based grader** that deterministically verifies the correctness of the final answer by:\n", + "1. Extracting the answer from the model's response using `\\\\boxed{}` notation\n", + "2. Comparing it against the reference answer from the training data\n", + "3. Returning a score of 1.0 for correct answers and 0.0 for incorrect ones\n", + "\n", + "This approach is more appropriate for math problems than LLM-based scoring, as mathematical correctness is objective." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Python RFT grader configured successfully\n" + ] + } + ], + "source": [ + "# Python-based grader for Azure OpenAI RFT\n", + "# Compares final answers using \\\\boxed{} notation only\n", + "\n", + "grading_function = \"\"\"import re\n", + "\n", + "def normalize(ans: str):\n", + " try:\n", + " if not isinstance(ans, str):\n", + " return []\n", + " parts = re.split(r\"[,\\s]+\", ans.strip())\n", + " return sorted(p for p in parts if p)\n", + " except Exception:\n", + " return []\n", + "\n", + "\n", + "def extract_model_answer(text: str):\n", + " try:\n", + " if not text or not isinstance(text, str):\n", + " return \"\"\n", + " \n", + " pattern = r\"\\\\\\\\boxed\\\\{\"\n", + " matches = list(re.finditer(pattern, text))\n", + " if not matches:\n", + " return \"\"\n", + " \n", + " last_match = matches[-1]\n", + " start = last_match.end()\n", + " \n", + " brace_count = 1\n", + " i = start\n", + " while i < len(text) and brace_count > 0:\n", + " if text[i] == '{':\n", + " brace_count += 1\n", + " elif text[i] == '}':\n", + " brace_count -= 1\n", + " i += 1\n", + " \n", + " if brace_count == 0:\n", + " return text[start:i-1].strip()\n", + " \n", + " return \"\"\n", + " except Exception:\n", + " return \"\"\n", + "\n", + "\n", + "def grade(sample, item):\n", + " try:\n", + " # Get model output - handle both dict and object access\n", + " if isinstance(sample, dict):\n", + " output_text = sample.get(\"output_text\", \"\") or sample.get(\"output_json\", \"\")\n", + " else:\n", + " output_text = getattr(sample, \"output_text\", \"\") or getattr(sample, \"output_json\", \"\")\n", + " \n", + " # Get reference answer\n", + " if isinstance(item, dict):\n", + " ref_raw = item.get(\"answer\", \"\")\n", + " else:\n", + " ref_raw = getattr(item, \"answer\", \"\")\n", + " \n", + " # Handle None or empty values\n", + " if not 
output_text:\n", + " return 0.0\n", + " if not ref_raw:\n", + " return 0.0\n", + " \n", + " # Convert output_json to string if it's a dict/object\n", + " if isinstance(output_text, dict):\n", + " output_text = str(output_text)\n", + " \n", + " pred_raw = extract_model_answer(str(output_text))\n", + " \n", + " if not pred_raw:\n", + " return 0.0\n", + "\n", + " pred = normalize(pred_raw)\n", + " ref = normalize(str(ref_raw))\n", + "\n", + " return 1.0 if pred == ref else 0.0\n", + " except Exception:\n", + " # Always return 0.0 on any error to prevent job failure\n", + " return 0.0\n", + "\"\"\"\n", + "\n", + "grader = {\n", + " \"type\": \"python\",\n", + " \"source\": grading_function\n", + "}\n", + "\n", + "print(\"Python RFT grader configured successfully\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 5.1. Test Grader Function (Optional)\n", + "\n", + "Test the grader locally against a small set of hand-written cases (correct, wrong, missing, and malformed answers) to verify that it behaves as expected before submitting the RFT job." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "PASS | Simple numeric answer - CORRECT\n", + "PASS | Simple numeric answer - WRONG\n", + "PASS | Missing \\boxed{} - should FAIL even if answer is correct\n", + "PASS | Multiple \\boxed{} - uses LAST one\n", + "PASS | Comma-separated values - order doesn't matter\n", + "PASS | Space-separated values - normalized\n", + "PASS | LaTeX expression\n", + "PASS | Empty \\boxed{}\n", + "ALL TESTS PASSED\n" + ] + } + ], + "source": [ + "exec(grading_function, globals())\n", + "\n", + "test_cases = [\n", + " {\n", + " \"name\": \"Simple numeric answer - CORRECT\",\n", + " \"sample\": {\"output_text\": \"The final answer is \\\\boxed{42}\"},\n", + " \"item\": {\"answer\": \"42\"},\n", + " \"expected\": 1.0\n", + " },\n", + " {\n", + " \"name\": \"Simple numeric answer - WRONG\",\n", + " \"sample\": {\"output_text\": \"The final answer is \\\\boxed{99}\"},\n", + " \"item\": {\"answer\": \"42\"},\n", + " \"expected\": 0.0\n", + " },\n", + " {\n", + " \"name\": \"Missing \\\\boxed{} - should FAIL even if answer is correct\",\n", + " \"sample\": {\"output_text\": \"The answer is 42\"},\n", + " \"item\": {\"answer\": \"42\"},\n", + " \"expected\": 0.0\n", + " },\n", + " {\n", + " \"name\": \"Multiple \\\\boxed{} - uses LAST one\",\n", + " \"sample\": {\"output_text\": \"First: \\\\boxed{wrong}, Final: \\\\boxed{42}\"},\n", + " \"item\": {\"answer\": \"42\"},\n", + " \"expected\": 1.0\n", + " },\n", + " {\n", + " \"name\": \"Comma-separated values - order doesn't matter\",\n", + " \"sample\": {\"output_text\": \"Answer: \\\\boxed{3, 2, 1}\"},\n", + " \"item\": {\"answer\": \"1, 2, 3\"},\n", + " \"expected\": 1.0\n", + " },\n", + " {\n", + " \"name\": \"Space-separated values - normalized\",\n", + " \"sample\": {\"output_text\": \"Answer: \\\\boxed{9 6 4}\"},\n", + " \"item\": {\"answer\": \"4, 6, 9\"},\n", + " \"expected\": 
1.0\n", + " },\n", + " {\n", + " \"name\": \"LaTeX expression\",\n", + " \"sample\": {\"output_text\": \"Solution: \\\\boxed{\\\\sqrt{2}}\"},\n", + " \"item\": {\"answer\": \"\\\\sqrt{2}\"},\n", + " \"expected\": 1.0\n", + " },\n", + " {\n", + " \"name\": \"Empty \\\\boxed{}\",\n", + " \"sample\": {\"output_text\": \"Answer: \\\\boxed{}\"},\n", + " \"item\": {\"answer\": \"42\"},\n", + " \"expected\": 0.0\n", + " }\n", + "]\n", + "all_passed = True\n", + "\n", + "for test in test_cases:\n", + " score = grade(test[\"sample\"], test[\"item\"])\n", + " passed = (score == test[\"expected\"])\n", + " all_passed = all_passed and passed\n", + " status = \"PASS\" if passed else \"FAIL\"\n", + " print(f\"{status} | {test['name']}\")\n", + " if not passed:\n", + " print(f\" Expected: {test['expected']}, Got: {score}\")\n", + "if all_passed:\n", + " print(\"ALL TESTS PASSED\")\n", + "else:\n", + " print(\"SOME TESTS FAILED\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 6. Upload Training Files\n", + "\n", + "Upload the training and validation JSONL files to Microsoft Foundry. 
Each file is assigned a unique ID that will be referenced when creating the fine-tuning job.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Uploading training file...\n", + "Uploading validation file...\n", + "Training file ID: file-2025c178eb324ee7afa324d4c4c41785\n", + "Validation file ID: file-32e0620690084456860349c09c01140c\n" + ] + } + ], + "source": [ + "print(\"Uploading training file...\")\n", + "with open(training_file_path, \"rb\") as f:\n", + " train_file = openai_client.files.create(file=f, purpose=\"fine-tune\")\n", + "\n", + "print(\"Uploading validation file...\")\n", + "with open(validation_file_path, \"rb\") as f:\n", + " validation_file = openai_client.files.create(file=f, purpose=\"fine-tune\")\n", + "\n", + "train_file_id = train_file.id\n", + "val_file_id = validation_file.id\n", + "\n", + "print(f\"Training file ID: {train_file_id}\")\n", + "print(f\"Validation file ID: {val_file_id}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 7. Wait for File Processing\n", + "\n", + "Microsoft Foundry needs to process the uploaded files before they can be used for fine-tuning. This step ensures the files are validated and ready for the RFT job." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Waiting for files to be processed...\n", + "Files ready!\n" + ] + } + ], + "source": [ + "print(\"Waiting for files to be processed...\")\n", + "openai_client.files.wait_for_processing(train_file_id)\n", + "openai_client.files.wait_for_processing(val_file_id)\n", + "print(\"Files ready!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 8. Create Reinforcement Fine-Tuning Job\n", + "\n", + "Create a reinforcement fine-tuning job with your uploaded datasets and grader function. 
Configure hyperparameters to control the training process." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Creating Reinforcement Fine-Tuning job...\n", + "Fine-tuning job created!\n", + "Job ID: ftjob-921f079fed5649b68e5e9ee53454233a\n", + "Status: pending\n", + "Model: o4-mini-2025-04-16\n" + ] + } + ], + "source": [ + "print(\"Creating Reinforcement Fine-Tuning job...\")\n", + "\n", + "fine_tuning_job = openai_client.fine_tuning.jobs.create(\n", + " training_file=train_file_id,\n", + " validation_file=val_file_id,\n", + " model=model_name,\n", + " method={\n", + " \"type\": \"reinforcement\",\n", + " \"reinforcement\": {\n", + " \"grader\": grader,\n", + " \"hyperparameters\": {\n", + " \"n_epochs\": 2,\n", + " \"batch_size\": 1,\n", + " \"learning_rate_multiplier\": 1.0,\n", + " \"eval_interval\": 5,\n", + " \"eval_samples\": 2,\n", + " \"reasoning_effort\": \"medium\"\n", + " }\n", + " }\n", + " },\n", + " extra_body={\n", + " \"trainingType\": \"GlobalStandard\"\n", + " },\n", + " suffix=\"math-reasoning-rft\"\n", + ")\n", + "\n", + "print(\"Fine-tuning job created!\")\n", + "print(f\"Job ID: {fine_tuning_job.id}\")\n", + "print(f\"Status: {fine_tuning_job.status}\")\n", + "print(f\"Model: {fine_tuning_job.model}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 9. Monitor Training Progress\n", + "\n", + "Track the status of your fine-tuning job. You can view the current status and recent training events. Training duration varies with dataset size, model, and hyperparameters, typically ranging from minutes to several hours."
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Status: pending\n" + ] + } + ], + "source": [ + "job_status = openai_client.fine_tuning.jobs.retrieve(fine_tuning_job.id)\n", + "print(f\"Status: {job_status.status}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 10. Retrieve Fine-Tuned Model\n", + "\n", + "After the fine-tuning job has succeeded, retrieve the fine-tuned model ID. This ID is required to make inference calls with your customized model." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Status: pending\n" + ] + } + ], + "source": [ + "completed_job = openai_client.fine_tuning.jobs.retrieve(fine_tuning_job.id)\n", + "\n", + "if completed_job.status == \"succeeded\":\n", + " fine_tuned_model = completed_job.fine_tuned_model\n", + " print(f\"Fine-tuned Model ID: {fine_tuned_model}\")\n", + "else:\n", + " print(f\"Status: {completed_job.status}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 11. Deploy Fine-Tuned Model\n", + "\n", + "Deploy the fine-tuned model to Azure OpenAI as a deployment endpoint. This step is required before making inference calls. The deployment uses the GlobalStandard SKU with a capacity of 200."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient\n", + "from azure.mgmt.cognitiveservices.models import Deployment, DeploymentProperties, DeploymentModel, Sku\n", + "\n", + "subscription_id = os.environ.get(\"AZURE_SUBSCRIPTION_ID\")\n", + "resource_group = os.environ.get(\"AZURE_RESOURCE_GROUP\")\n", + "account_name = os.environ.get(\"AZURE_AOAI_ACCOUNT\")\n", + "\n", + "deployment_name = \"o4-mini-math-reasoning-rft\"\n", + "\n", + "with CognitiveServicesManagementClient(credential=credential, subscription_id=subscription_id) as cogsvc_client:\n", + " deployment_model = DeploymentModel(format=\"OpenAI\", name=fine_tuned_model, version=\"1\")\n", + " deployment_properties = DeploymentProperties(model=deployment_model)\n", + " deployment_sku = Sku(name=\"GlobalStandard\", capacity=200)\n", + " deployment_config = Deployment(properties=deployment_properties, sku=deployment_sku)\n", + " \n", + " print(f\"Deploying fine-tuned model: {fine_tuned_model}\")\n", + " deployment = cogsvc_client.deployments.begin_create_or_update(\n", + " resource_group_name=resource_group,\n", + " account_name=account_name,\n", + " deployment_name=deployment_name,\n", + " deployment=deployment_config,\n", + " )\n", + " \n", + " print(\"Waiting for deployment to complete...\")\n", + " deployment.result()\n", + "\n", + "print(f\"Model deployment completed: {deployment_name}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 12. Test Fine-Tuned Model\n", + "\n", + "Test your fine-tuned model by solving a mathematical reasoning problem." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "test_problem = \"\"\"Solve for x: 3x + 7 = 22\"\"\"\n", + "\n", + "print(\"Testing fine-tuned model...\")\n", + "print(f\"Problem: {test_problem}\")\n", + "\n", + "response = openai_client.responses.create(\n", + " model=deployment_name,\n", + " input=[\n", + " {\"role\": \"system\", \"content\": \"You are a mathematical reasoning assistant. Solve problems step-by-step, showing your work clearly. Provide the final answer in \\\\boxed{} format.\"},\n", + " {\"role\": \"user\", \"content\": test_problem}\n", + " ]\n", + ")\n", + "\n", + "print(f\"Model Response: {response.output_text}\")\n" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "env", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/Demos/RFT_Math_Reasoning/training_rft.jsonl b/Demos/RFT_Math_Reasoning/training_rft.jsonl new file mode 100644 index 0000000..df428dd --- /dev/null +++ b/Demos/RFT_Math_Reasoning/training_rft.jsonl @@ -0,0 +1,100 @@ +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nCall a scalene triangle K [i]disguisable[/i] if there exists a triangle K′ similar to K with two shorter sides precisely as long as the two longer sides of K, respectively.
Call a disguisable triangle [i]integral[/i] if the lengths of all its sides are integers.\r\n(a) Find the side lengths of the integral disguisable triangle with the smallest possible perimeter.\r\n(b) Let K be an arbitrary integral disguisable triangle for which no smaller integral\r\ndisguisable triangle similar to it exists. Prove that at least two side lengths of K are\r\nperfect squares.", "role": "user"}], "answer": "9"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n11. A math competition problem: the probabilities of A, B, and C solving this problem independently are $\\frac{1}{a}$, $\\frac{1}{b}$, and $\\frac{1}{c}$, respectively, where $a$, $b$, and $c$ are positive integers less than 10. Now A, B, and C are solving this problem independently. If the probability that exactly one of them solves the problem is $\\frac{7}{15}$, then the probability that none of them solves the problem is", "role": "user"}], "answer": "\\dfrac{4"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n8 There are six cards, each with a number $1, 2, 3, 4, 5, 6$ written on it. Each time, one card is drawn, the number on it is noted, and then it is put back. This is done 4 times. The probability that the difference between the maximum and minimum numbers drawn is equal to 5 is $\\qquad$.", "role": "user"}], "answer": "\\dfrac{151"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. 
Show all intermediate calculations and explain your reasoning clearly.\n\n5th Swedish 1965 Problem 2 Find all positive integers m, n such that m 3 - n 3 = 999.", "role": "user"}], "answer": "(12, 9)"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n\n5. Given three positive real numbers $o, \\rho, v$, construct a triangle $A B C$ with perimeter equal to $o$, the radius of its $A$-excircle equal to $\\rho$, and the length of its $A$ altitude equal to $v$. Determine the number of solutions (non-congruent triangles) in terms of the given lengths.\n\n(Patrik Bak)\n", "role": "user"}], "answer": "0"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n3. In the afternoon, 5 classes need to be scheduled: Physics, Chemistry, Biology, and two self-study periods. If the first class cannot be Biology and the last class cannot be Physics, then the number of different scheduling methods is ( ) kinds.\n(A) 36\n(B) 39\n(C) 60\n(D) 78", "role": "user"}], "answer": "B"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n9. A, B, C, and D are comparing their heights. Among them, the sum of the heights of two people is the same as the sum of the heights of the other two. The average height of A and B is 4 cm more than the average height of A and C. D is 10 cm taller than A, and the sum of the heights of B and C is 288 cm. 
A's height is $\\qquad$ cm.", "role": "user"}], "answer": "139"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n1. Solve the equation $x^{\\log _{5}(0.008 x)}=\\frac{125}{x^{5}}$.", "role": "user"}], "answer": "\\dfrac{1"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n4.9 If $b0\n$$", "role": "user"}], "answer": "1, 2, 3, 4, 5, 6, 7"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nGiven $\\triangle A B C$ has a perimeter of 20, area of $01 \\sqrt{3}$, and $\\angle A=60^{\\circ}$. Find $\\sin A: \\sin B: \\sin C$.", "role": "user"}], "answer": "7:8:5"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nB2 Points $A(-4,-2), B(2,-2)$ and $C(2,6)$ are the vertices of triangle $A B C$.\n\nA Draw triangle $A B C$ in the coordinate system.\n\nB Calculate the perimeter of triangle $A B C$.\n\nC Calculate the area of triangle $A B C$.\n\nD Circumscribe a circle around triangle $A B C$.", "role": "user"}], "answer": "(x + 1)^2 + (y - 2)^2 = 25"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. 
Show all intermediate calculations and explain your reasoning clearly.\n\n【Question 1】Calculate: $2 \\times(999999+5 \\times 379 \\times 4789)=$", "role": "user"}], "answer": "20150308"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n2. Simplify the expression\n\n$$\nA=\\left(1+\\frac{1+i}{2}\\right)\\left(1+\\left(\\frac{1+i}{2}\\right)^{2}\\right)\\left(1+\\left(\\frac{1+i}{2}\\right)^{4}\\right)\\left(1+\\left(\\frac{1+i}{2}\\right)^{8}\\right)\n$$\n\n( $i$ is the imaginary unit)", "role": "user"}], "answer": "\\dfrac{255(1 + i)"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nExample 3 (1) Find the domain of the function $y=\\sqrt{\\frac{1}{2}-\\log _{2}(3-x)}$;\n(2) Given that the domain of $f(x)$ is $[0,1]$, find the domain of the function $y=f\\left(\\log _{\\frac{1}{2}}(3-x)\\right)$.", "role": "user"}], "answer": "[2, \\frac{5"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n8. As shown in Figure $3, A C=$ $B C, A C \\perp B C$ at point $C, A B=A D=B D$, $C D=C E=D E$. If $A B=\\sqrt{2}$, then $B E=$", "role": "user"}], "answer": "1"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nAt each vertex of a tetrahedron, there sits an ant. 
At a given moment, each of them starts to move along a randomly chosen edge and crosses over to the adjacent vertex. What is the probability that two ants meet either on the way or at the end of their journey?", "role": "user"}], "answer": "\\dfrac{25"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n7. A turtle and a rabbit are in a 1000-meter race. The rabbit took a nap 200 meters from the finish line. When it woke up, it found that the turtle was 10 meters from the finish line, so it immediately chased after and eventually both reached the finish line at the same time. If the turtle crawled at a constant speed throughout the race, and the rabbit's speed before and after the nap was the same, then, during the rabbit's nap, the turtle crawled $\\qquad$ meters.", "role": "user"}], "answer": "950"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nWe are given 5771 weights weighing 1,2,3,...,5770,5771. We partition the weights into $n$ sets of equal weight. What is the maximal $n$ for which this is possible?", "role": "user"}], "answer": "2886"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n7. As shown in the figure, the beads on the bracelet are numbered from 1 to 22 in a counterclockwise direction starting from the pendant bead. 
Xiao Ming is playing a bead counting game, with the rule being: starting from bead 1, count natural numbers in a clockwise direction, but skip any number that contains the digit 7 or is a multiple of 7, and directly count the next number. For example: after counting to 6, the next number is 8; after counting to 13, the next number is 15, and so on. So, when counting to 100, which bead number $\\qquad$ will it land on?", "role": "user"}], "answer": "19"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nLet $ABC$ be a triangle and let $P$ be a point in its interior. Suppose $ \\angle B A P = 10 ^ { \\circ } , \\angle A B P = 20 ^ { \\circ } , \\angle P C A = 30 ^ { \\circ } $ and $ \\angle P A C = 40 ^ { \\circ } $. Find $ \\angle P B C $.", "role": "user"}], "answer": "60^\\circ"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n(2) In the Cartesian coordinate system $x O y$, the area of the figure enclosed by the curve $2|x|+3|y|=5$ is ( ).\n(A) $\\frac{5}{3}$\n(B) 5\n(C) $\\frac{20}{3}$\n(D) $\\frac{25}{3}$", "role": "user"}], "answer": "D"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nGiven a digits {$0,1,2,...,9$} . Find the number of numbers of 6 digits which cantain $7$ or $7$'s digit and they is permulated(For example 137456 and 314756 is one numbers).", "role": "user"}], "answer": "2002"} +{"messages": [{"content": "You are a mathematical reasoning expert. 
Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n[ Skeletons of Polyhedral Figures ]\n\nA toy factory produces wireframe cubes, with small colorful beads located at their vertices. According to the GOST standard, each cube must use beads of all eight colors (white and the seven colors of the rainbow). How many different models of cubes can the factory produce?\n\n#", "role": "user"}], "answer": "1680"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n## Problem Statement\n\nCalculate the limit of the function:\n\n$\\lim _{x \\rightarrow 0} \\frac{6^{2 x}-7^{-2 x}}{\\sin 3 x-2 x}$", "role": "user"}], "answer": "2 \\ln 42"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nG5.3 If $R^{2000}<5^{3000}$, where $R$ is a positive integer, find the largest value of $R$.", "role": "user"}], "answer": "11"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n## Task 18/78\n\nDetermine the greatest common divisor of all numbers $z$ that can be represented in the form $z=n^{4 m+1}-n$ with $m ; n \\in N$!", "role": "user"}], "answer": "30"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. 
Show all intermediate calculations and explain your reasoning clearly.\n\nWhich is the two-digit number that, when divided by the digit in the units place, gives a quotient of 9 and a remainder of 6?", "role": "user"}], "answer": "78"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n5. In $\\triangle A B C$, medians $A D$ and $C F$ intersect at point $G$. If $\\angle A F G=45^{\\circ}, \\angle A G C=60^{\\circ}$, then the degree measure of $\\angle A C F$ is ( ).\n(A) $30^{\\circ}$\n(B) $45^{\\circ}$\n(C) $60^{\\circ}$\n(D) $75^{\\circ}$", "role": "user"}], "answer": "D"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nFind the maximum positive integer $k$ such that for any positive integers $m,n$ such that $m^3+n^3>(m+n)^2$, we have\n\n$$m^3+n^3\\geq (m+n)^2+k$$\n\n[i] Proposed by Dorlir Ahmeti, Albania[/i]", "role": "user"}], "answer": "10"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n11. Choose three different digits from $0,1, \\cdots, 9$ to form a four-digit number (one of the digits can appear twice), such as 5 224. Then the total number of such four-digit numbers is $\\qquad$.", "role": "user"}], "answer": "3888"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n2. 
The three sides of a triangle $a, b, c$ are all integers, and satisfy $a b c + b c + c a + a b + a + b + c = 7$. Then the area of the triangle is equal to ().\n(A) $\\frac{\\sqrt{3}}{2}$\n(B) $\\frac{\\sqrt{2}}{4}$\n(C) $\\frac{\\sqrt{3}}{4}$\n(D) $\\frac{\\sqrt{2}}{2}$", "role": "user"}], "answer": "C"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n## Task Condition\n\nFind the point $M^{\\prime}$ symmetric to the point $M$ with respect to the line.\n\n$M(2 ; 1 ; 0)$\n\n$\\frac{x-2}{0}=\\frac{y+1.5}{-1}=\\frac{z+0.5}{1}$", "role": "user"}], "answer": "(2, -2, -3)"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nA number is called Norwegian if it has three distinct positive divisors whose sum is equal to 2022. Determine the smallest Norwegian number. (Note: The total number of positive divisors of a Norwegian number is allowed to be larger than 3.) (Cyprus) Answer: 1344", "role": "user"}], "answer": "1344"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nB5. My 24-hour digital clock displays hours and minutes only.\nFor how many different times does the display contain at least one occurrence of the digit 5 in a 24-hour period?", "role": "user"}], "answer": "450"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. 
Show all intermediate calculations and explain your reasoning clearly.\n\n17.51 In $\\triangle A B C$, $A C: C B=3: 4, \\angle C$'s exterior angle bisector intersects the extension of $B A$ at $P$ ($A$ is between $P$ and $B$), then $P A: A B$ is\n(A) $1: 3$.\n(B) $3: 4$.\n(C) $4: 3$.\n(D) $3: 1$.\n(E) $7: 1$.\n(12th American High School Mathematics Examination, 1961)", "role": "user"}], "answer": "D"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n7. The program flowchart is shown in Figure 3. Given that the real number $x \\in [1, 9]$. Then the probability that the output $x$ is not less than 55 is ( ).\n(A) $\\frac{1}{3}$\n(B) $\\frac{2}{3}$\n(C) $\\frac{3}{8}$\n(D) $\\frac{5}{8}$", "role": "user"}], "answer": "D"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n18. There are 30 students standing in a row, and from left to right, they are wearing hats in the order of \"red, yellow, blue, red, yellow, blue......\". The PE teacher asks the students to count off from left to right as \"1, 2, 1, 2, 1, 2.....\", and those who count off 2 step forward. Among the remaining students, those wearing red hats step back. At this point, the students are divided into 3 rows, and in the middle row, there are $\\qquad$ people wearing blue hats.", "role": "user"}], "answer": "5"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n1. 19 If $\\frac{1}{x}+\\frac{1}{y}+\\frac{1}{z}=\\frac{1}{x+y+z}=1$. 
Then,\n(A) at least one of $x, y, z$ is equal to 1.\n(B) $x, y, z$ are all equal to 1.\n(C) $x, y, z$ are all not equal to 1.\n(D) none of the above conclusions is correct.\n(8th \"Jinyun Cup\" Junior High School Mathematics Invitational, 1991)", "role": "user"}], "answer": "A"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n3. Let $x_{1}$ and $x_{2}$ be the roots of the equation\n\n$$\np^{2} x^{2}+p^{3} x+1=0\n$$\n\nDetermine $p$ such that the expression $x_{1}^{4}+x_{2}^{4}$ has the smallest value.", "role": "user"}], "answer": "\\pm \\sqrt{2"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n663. The grasshopper jumped from the point with coordinate 8 to the point with coordinate 17.5. Then it made another jump of the same length (in the same direction). At the point with what coordinate did it end up?", "role": "user"}], "answer": "27"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n## Task 1 - 251211\n\nDetermine all pairs $(x ; y)$ of real numbers that satisfy the following system of equations (1), (2).\n\n$$\n\\begin{aligned}\n& x^{2}+y=1 \\\\\n& x+y^{2}=1\n\\end{aligned}\n$$", "role": "user"}], "answer": "\\left( \\frac{-1 - \\sqrt{5"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. 
Show all intermediate calculations and explain your reasoning clearly.\n\nThe radius of the Earth measures approximately $6378 \\mathrm{~km}$ at the Equator. Suppose a wire is exactly adjusted over the Equator.\n\nThen, suppose the length of the wire is increased by $1 \\mathrm{~m}$, so that the wire and the Equator form concentric circles around the Earth. Would a man standing, an ant, or an elephant be able to pass under this wire?", "role": "user"}], "answer": "An ant"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n10) In a bag there are some marbles. Maria says: \"In the bag there are three marbles in total and they are black.\" Luca says: \"In the bag there are two black marbles and two red marbles.\" Giorgio says: \"In the bag there are only black marbles.\" Knowing that only one of the three is lying, how many marbles are in the bag?\n\n(A) one, (B) two, (C) three, (D) four, (E) it is not possible to determine the number based on the information given in the problem.", "role": "user"}], "answer": "C"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n9. Four cousins Alan, Bob, Carl and Dan are $3,8,12$ and 14 years old, although not necessarily in that order. Alan is younger than Carl. The sum of the ages of Alan and Dan is divisible by 5. The sum of the ages of Carl and Dan is divisible by 5. What is the sum of the ages of Alan and Bob?\nA 26\nB 22\nC 17\nD 15\nE 11", "role": "user"}], "answer": "C"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. 
Show all intermediate calculations and explain your reasoning clearly.\n\n4. In $\\triangle A B C$, $\\angle A=30^{\\circ}, \\angle B=105^{\\circ}$, a line $D E$ is drawn through a point $D$ on side $A C$, intersecting side $A B$ or side $B C$ at point $E$, such that $\\angle C D E=60^{\\circ}$, and $D E$ bisects the area of $\\triangle A B C$. Then $\\left(\\frac{C D}{A C}\\right)^{2}=$ $\\qquad$ -", "role": "user"}], "answer": "\\dfrac{\\sqrt{3"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n1. (1) Use the Factor Theorem to prove that $a-b, b-c, c-a$ are all factors of $a^{2}(b-c)+b^{2}(c-a)+c^{2}(a-b)$.\n(2) Using the conclusion from (1), factorize $a^{2}(b-c)+b^{2}(c-a)+c^{2}(a-b)$.\n(3) Factorize: $(x+y+z)^{3}-x^{3}-y^{3}-z^{3}$.", "role": "user"}], "answer": "3(x + y)(y + z)(z + x)"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n## Problem Statement\n\nFind the point of intersection of the line and the plane.\n\n$\\frac{x+1}{3}=\\frac{y-3}{-4}=\\frac{z+1}{5}$\n\n$x+2 y-5 z+20=0$", "role": "user"}], "answer": "(2, -1, 4)"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nThe repeating decimals $0.abab\\overline{ab}$ and $0.abcabc\\overline{abc}$ satisfy\n\\[0.abab\\overline{ab}+0.abcabc\\overline{abc}=\\frac{33}{37},\\]\nwhere $a,b$, and $c$ are (not necessarily distinct) digits. 
Find the three-digit number $abc$.", "role": "user"}], "answer": "447"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n2 Find all integer $x$ such that $\\left|4 x^{2}-12 x-27\\right|$ is a prime number.\n\nFind all integer $x$ such that $\\left|4 x^{2}-12 x-27\\right|$ is a prime number.", "role": "user"}], "answer": "5"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n10,11\n\nThe angle in the development of the lateral surface of a cone is $120^{\\circ}$. Find the angle at the vertex of the axial section of the cone.", "role": "user"}], "answer": "\\arccos \\left( \\dfrac{7"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nExample 6 Find the number of $n$-digit numbers formed by $1,2,3$, where each of 1, 2, and 3 must appear at least once in the $n$-digit number.", "role": "user"}], "answer": "3^n - 3 \\cdot 2^n + 3"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n15 . In a $3 \\times 3$ grid, fill in the numbers $1,2,3,4,5,6,7,8,9$ so that the sum of the numbers in any three squares in a row, column, or diagonal is equal.", "role": "user"}], "answer": "\\begin{array"} +{"messages": [{"content": "You are a mathematical reasoning expert. 
Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n\n4. How many 6-tuples $\\left(a_{1}, a_{2}, a_{3}, a_{4}, a_{5}, a_{6}\\right)$ are there such that each of $a_{1}, a_{2}, a_{3}, a_{4}, a_{5}, a_{6}$ is from the set $\\{1,2,3,4\\}$ and the six expressions\n\n$$\na_{j}^{2}-a_{j} a_{j+1}+a_{j+1}^{2}\n$$\n\nfor $j=1,2,3,4,5,6$ (where $a_{7}$ is to be taken as $a_{1}$ ) are all equal to one another?\n", "role": "user"}], "answer": "40"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n3. In a right trapezoid $A B C D$, $A D / / B C$, $B C \\perp C D$, $E$ is the midpoint of side $A B$, $B C=C D=C E$. Then the degree measure of $\\angle B$ is ( ).\n(A) $52.5^{\\circ}$\n(B) $62.5^{\\circ}$\n(C) $60^{\\circ}$\n(D) $75^{\\circ}$", "role": "user"}], "answer": "D"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nProblem 4. The sum of eight consecutive natural numbers is equal to 92. What are these numbers?", "role": "user"}], "answer": "15"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nIn a tournament, any two players play against each other. Each player gets one point for a win, 1/2 for a draw, and 0 points for a loss. Let $S$ be the set of the 10 lowest scores. 
We know that each player obtained half of their score playing against players from $S$.\n\na) What is the sum of the scores of the players in $S$?\n\nb) Determine how many participants are in the tournament.\n\nNote: Each player plays only once against each opponent.", "role": "user"}], "answer": "25"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n9. (16 points) For a given positive integer $M$, define $f_{1}(M)$ as the square of the sum of the digits of $M$. When $n>1$ and $n \\in \\mathbf{N}$, $f_{n}\\left(f_{n-1}(M)\\right)$ represents the $r_{n}$-th power of the sum of the digits of $f_{n-1}(M)$, where, when $n$ is odd, $r_{n}=2$; when $n$ is even, $r_{n}=3$. Find the value of $f_{2012}\\left(f_{2011}\\left(3^{2010}\\right)\\right)$.", "role": "user"}], "answer": "should be \\( f_{2012}(f_{2011}(3^{2010})) \\)."} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nExample 4 Given $A(-2,2)$, and $B$ is a moving point on the ellipse $\\frac{x^{2}}{25}+\\frac{y^{2}}{16}=1$, $F$ is the left focus. When $\\mid A B \\mid +\\frac{5}{3}|B F|$ takes the minimum value, find the coordinates of point $B$.\n(1999, National High School Mathematics Competition)", "role": "user"}], "answer": "\\left( -\\dfrac{5\\sqrt{3"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n3. 
Given the sequence $\\left\\{a_{n}\\right\\}$ satisfies $a_{1}=\\frac{7}{6}$, and for any positive integer $n$, the quadratic equation $a_{n} x^{2}-$ $a_{n+1} x+1=0$ has two roots $\\alpha_{n}, \\beta_{n}$ that satisfy $6 \\alpha_{n}-2 \\alpha_{n} \\beta_{n}+6 \\beta_{n}=3$. Then the general term formula for $\\left\\{a_{n}\\right\\}$ is $a_{n}=$ $\\qquad$", "role": "user"}], "answer": "\\dfrac{2"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n4.1. Mom baked four raisin buns for breakfast for her two sons. $V$ In the first three buns, she put 7, 7, 23 raisins, and some more in the fourth. It turned out that the boys ate an equal number of raisins and did not divide any bun into parts. How many raisins could Mom have put in the fourth bun? List all the options.", "role": "user"}], "answer": "37"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nFind the number of ways in which three numbers can be selected from the set $\\{1,2,\\cdots ,4n\\}$, such that the sum of the three selected numbers is divisible by $4$.", "role": "user"}], "answer": "\\dfrac{n(4n -1)(2n -1)"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nAs shown in Figure $3, \\odot O$ is inside the rectangle $ABCD$. Tangents are drawn from vertices $A, B, C, D$ to $\\odot O$, with points of tangency being $A_{1}, B_{1}, C_{1}, D_{1}$, respectively. 
If $AA_{1}=3, BB_{1}=4$, $CC_{1}=5$, find the length of $DD_{1}$.", "role": "user"}], "answer": "3\\sqrt{2"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nCondition of the problem\n\nCalculate the limit of the function:\n\n$\\lim _{x \\rightarrow 1}\\left(\\frac{x^{3}-1}{x-1}\\right)^{\\frac{1}{x^{2}}}$", "role": "user"}], "answer": "3"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n$$\n\\begin{array}{l}\n\\text { Three. (20 points) (1) The quadratic function } \\\\\nf(x)=a x^{2}+b x+c(a, b, c \\in \\mathbf{R}, a \\neq 0)\n\\end{array}\n$$\n\nsatisfies\n(i) For $x \\in \\mathbf{R}$, there is $4 x \\leqslant f(x) \\leqslant \\frac{1}{2}(x+2)^{2}$\n\nalways holds;\n(ii) $f(-4+2 \\sqrt{3})=0$.\nFind $f(x)$.\n(2) Let $f_{1}(x)=\\frac{3}{2+x}, f_{n+1}(x)=f_{1}\\left(f_{n}(x)\\right)$, find $f_{2009}(0)$.", "role": "user"}], "answer": "\\dfrac{3^{2010"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n5. Let $S=\\left\\{(x, y) \\mid x^{2}-y^{2}=\\right.$ odd, $\\left.x, y \\in R\\right\\}, T=\\{(x, y) \\mid$ $\\left.\\sin \\left(2 \\pi x^{2}\\right)-\\sin \\left(2 \\pi y^{2}\\right)=\\cos \\left(2 \\pi x^{2}\\right)-\\cos \\left(2 \\pi y^{2}\\right), x, y \\in R\\right\\}$. Then\n(A) $S \\subset T$;\n(B) $T \\subset S$;\n(C) $S=T$;\n(D) $S \\cap T=\\Phi$.", "role": "user"}], "answer": "A"} +{"messages": [{"content": "You are a mathematical reasoning expert. 
Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n5. In trapezoid $A B C D$, $A B / / C D$, the base angles $\\angle D A B=36^{\\circ}, \\angle C B A=54^{\\circ}, M$ and $N$ are the midpoints of sides $A B$ and $C D$, respectively. If the lower base $A B$ is exactly 2008 units longer than the upper base $C D$, then the line segment $M N=$ $\\qquad$", "role": "user"}], "answer": "1004"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nThe seventh grades of our school competed in collecting caps from PET bottles. Class A collected half of what classes B and C collected together, class B collected a third of what classes A and C collected together, and class C collected 150 caps.\n\nDetermine how many caps these three classes collected in total.\n\n(M. Volfová)", "role": "user"}], "answer": "360"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nIn a certain year the price of gasoline rose by $20\\%$ during January, fell by $20\\%$ during February, rose by $25\\%$ during March, and fell by $x\\%$ during April. The price of gasoline at the end of April was the same as it had been at the beginning of January. To the nearest integer, what is $x$\n$\\mathrm{(A)}\\ 12\\qquad \\mathrm{(B)}\\ 17\\qquad \\mathrm{(C)}\\ 20\\qquad \\mathrm{(D)}\\ 25\\qquad \\mathrm{(E)}\\ 35$", "role": "user"}], "answer": "B"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. 
Show all intermediate calculations and explain your reasoning clearly.\n\n15. Given $f(x)=x^{2}+c$, and $f(f(x))=f\\left(x^{2}+1\\right)$.\n(1) Let $g(x)=f(f(x))$, find the analytical expression of the function $g(x)$;\n(2) Let $\\varphi(x)=g(x)-\\lambda f(x)$, try to find the value of the real number $\\lambda$ such that $\\varphi(x)$ is a decreasing function on $(-\\infty,-1]$ and an increasing function on $[-1,0)$.", "role": "user"}], "answer": "4"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nThe number $6545$ can be written as a product of a pair of positive two-digit numbers. What is the sum of this pair of numbers?\n$\\text{(A)}\\ 162 \\qquad \\text{(B)}\\ 172 \\qquad \\text{(C)}\\ 173 \\qquad \\text{(D)}\\ 174 \\qquad \\text{(E)}\\ 222$", "role": "user"}], "answer": "A"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n3. Given are the sides $a = BC$ and $b = CA$ of triangle $ABC$. Determine the length of the third side if it is equal to the length of the corresponding height. For which values of $a$ and $b$ does the problem have a solution?", "role": "user"}], "answer": "\\sqrt{\\dfrac{a^2 + b^2 + 2\\sqrt{3a^2b^2 - a^4 - b^4"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n## Task 15/85\n\nLet the numbers $a ; b ; c$ represent the side lengths of a triangle with perimeter $U$, and let $a^{2}$; $b^{2} ; c^{2}$ represent the side lengths of another triangle with perimeter $U^{\\prime}$. 
Determine the lower bound of the ratio $U^{2}: U^{\\prime}$!", "role": "user"}], "answer": "2"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n7. Let the edge length of a regular tetrahedron be $2 \\sqrt{6}$, and let a sphere be constructed with the center $O$ of the tetrahedron as its center. The total length of the curves formed by the intersection of the sphere's surface with the four faces of the tetrahedron is $4 \\pi$. Then the radius of the sphere $O$ is $\\qquad$", "role": "user"}], "answer": "\\dfrac{\\sqrt{5"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nExample 2.65. Calculate the volume of the body formed by the rotation around the $O Y$ axis of the curvilinear trapezoid bounded by the hyperbola $x y=2$ and the lines $y_{1}=1, y_{2}=4$ and $y_{3}=0$.", "role": "user"}], "answer": "3\\pi"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nFor two quadratic trinomials $P(x)$ and $Q(x)$ there is a linear function $\\ell(x)$ such that $P(x)=Q(\\ell(x))$ for all real $x$. How many such linear functions $\\ell(x)$ can exist?\n\n[i](A. Golovanov)[/i]", "role": "user"}], "answer": "2"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n1. 
A car is driving at a constant speed in one direction along a straight road, near which there are two houses. At noon, when the car had not yet reached the houses, the sum of the distances from it to these houses was 10 km. After 10 minutes, when the car had already passed both houses, it turned out that the sum of the distances from it to the houses was again 10 km. What is the speed of the car? (Problem author - I. Rubanov)", "role": "user"}], "answer": "60"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n$18 \\cdot 110$ As shown, $A B C D$ is an isosceles trapezoid, $A D$ $=B C=5, A B=4, D C=10$, point $C$ is on $D F$, and $B$ is the midpoint of the hypotenuse of the right triangle $\\triangle D E F$. Then $C F$ equals\n(A) 3.25 .\n(B) 3.5 .\n(C) 3.75 .\n(D) 4.0 .\n(E) 4.25 .", "role": "user"}], "answer": "D"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n1. Given 2117 cards, on which natural numbers from 1 to 2117 are written (each card has exactly one number, and the numbers do not repeat). It is required to choose two cards such that the sum of the numbers written on them is divisible by 100. In how many ways can this be done?", "role": "user"}], "answer": "22386"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n2. 
(6 points) A certain duplicator can print 3600 sheets of paper per hour, so printing 240 sheets of paper requires $\\qquad$ minutes.", "role": "user"}], "answer": "4"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n1 $\\left[\\begin{array}{ll}\\text { Common fractions }\\end{array}\\right]$\n\nWhich of the three fractions is the largest: $3 / 4, 4 / 5$ or $5 / 6$?", "role": "user"}], "answer": "\\dfrac{5"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nIn the given operation, the letters $a, b$, and $c$ represent distinct digits and are different from 1. Determine the values of $a, b$, and $c$.\n\n$$\n\\begin{array}{r}\na b b \\\\\n\\times \\quad c \\\\\n\\hline b c b 1\n\\end{array}\n$$", "role": "user"}], "answer": "c=7"} \ No newline at end of file diff --git a/Demos/RFT_Math_Reasoning/validation_rft.jsonl b/Demos/RFT_Math_Reasoning/validation_rft.jsonl new file mode 100644 index 0000000..1330779 --- /dev/null +++ b/Demos/RFT_Math_Reasoning/validation_rft.jsonl @@ -0,0 +1,10 @@ +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nExample 3 Given $a=\\frac{1}{2} \\sqrt{\\sqrt{2}+\\frac{1}{8}}-\\frac{\\sqrt{2}}{8}$. Try to find the value of $a^{2}+\\sqrt{a^{4}+a+1}$.", "role": "user"}], "answer": "\\sqrt{2"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. 
Show all intermediate calculations and explain your reasoning clearly.\n\nThe surface area of a cube is 24 . The volume of the cube is\n(A) 4\n(B) $3 \\sqrt{3}$\n(C) 9\n(D) 16\n(E) 8", "role": "user"}], "answer": "E"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n2. Given $\\sin 2 x=\\frac{\\sin \\theta+\\cos \\theta}{2}$, $\\cos ^{2} x=\\sin \\theta \\cdot \\cos \\theta$.\nThen, the value of $\\cos 2 x$ is ( ).\n(A) $\\frac{-1 \\pm \\sqrt{33}}{8}$\n(B) $\\frac{-1+\\sqrt{33}}{8}$\n(C) $\\frac{-1-\\sqrt{33}}{8}$\n(D) 0", "role": "user"}], "answer": "C"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n4. Find the sum\n\n$$\n\\log \\operatorname{tg} 1^{\\circ}+\\log \\operatorname{tg} 2^{\\circ}+\\ldots+\\log \\operatorname{tg} 89^{\\circ}\n$$", "role": "user"}], "answer": "0"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n1. Find the sum of the numbers:\n\n$$\n3+33+333+3333+\\cdots+\\underbrace{33333 \\ldots 3}_{2018 \\text { of them }} .\n$$", "role": "user"}], "answer": "\\dfrac{10^{2019"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\n12.407 A line perpendicular to the chord of a segment divides the chord in the ratio 1:4, and the arc - in the ratio $1: 2$. 
Find the cosine of the central angle subtended by this arc.", "role": "user"}], "answer": "-\\dfrac{23"} +{"messages": [{"content": "You are a mathematical reasoning expert. Solve problems with detailed step-by-step thinking and provide final answers in \\boxed{} format. Show all intermediate calculations and explain your reasoning clearly.\n\nFor some positive integer $n$, a coin will be flipped $n$ times to obtain a sequence of $n$ heads and tails. For each flip of the coin, there is probability $p$ of obtaining a head and probability $1-p$ of obtaining a tail, where $0