An open large reasoning model for real-world solutions by the Alibaba International Digital Commerce Group (AIDC-AI).
7b
20.7K Pulls Updated 2 weeks ago
4752e62baa0a · 4.7GB
model
arch qwen2 · parameters 7.62B · quantization Q4_K_M
4.7GB
template
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
<|im_start|>{{ .R
239B
system
You are a well-trained AI assistant, and your name is Marco-o1. By the Alibaba International Digital Commerce …
465B
license
Apache License
Version 2.0, January 200
11kB
Readme
- Fine-Tuning with CoT Data: We develop Marco-o1-CoT by performing full-parameter fine-tuning on the base model using an open-source CoT dataset combined with our self-developed synthetic data.
- Solution Space Expansion via MCTS: We integrate LLMs with MCTS (Marco-o1-MCTS), using the model’s output confidence to guide the search and expand the solution space.
- Reasoning Action Strategy: We implement novel reasoning action strategies and a reflection mechanism (Marco-o1-MCTS mini-step), including exploring different action granularities within the MCTS framework and prompting the model to self-reflect, thereby significantly enhancing the model’s ability to solve complex problems.
- Application in Translation Tasks: We are the first to apply Large Reasoning Models (LRMs) to machine translation tasks, exploring inference-time scaling laws in the multilingual and translation domains.
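The confidence-guided search mentioned in the MCTS bullet can be illustrated with a small scoring sketch. The README does not give the formula, so everything here is an assumption: each token's confidence is taken as its probability softmax-normalized against a handful of alternative tokens, and a rollout's score is the mean of those confidences. The names `token_confidence` and `path_score` are hypothetical and not part of any Marco-o1 or Ollama API.

```python
import math


def token_confidence(chosen_logprob, alternative_logprobs):
    """Softmax-normalize the chosen token's log-prob against its alternatives.

    Assumption: the alternatives are the log-probs of the other top-k
    candidate tokens at the same position.
    """
    candidates = [chosen_logprob] + list(alternative_logprobs)
    peak = max(candidates)  # subtract the max for numerical stability
    denom = sum(math.exp(lp - peak) for lp in candidates)
    return math.exp(chosen_logprob - peak) / denom


def path_score(steps):
    """Average per-token confidence over one rollout.

    `steps` is a list of (chosen_logprob, alternative_logprobs) pairs.
    An empty rollout scores 0.0.
    """
    if not steps:
        return 0.0
    total = sum(token_confidence(chosen, alts) for chosen, alts in steps)
    return total / len(steps)
```

Under these assumptions, a search procedure would prefer expanding nodes whose rollouts have a higher `path_score`; how the score feeds into node selection is left open here.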
Usage
ollama run marco-o1 "How many Rs are in strawberry?"
Parse the resulting string between <Output> and </Output>:
...
<Output>
There are 3 Rs in strawberry.
</Output>
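The parsing step can be sketched in Python as follows. This is a minimal example, assuming the `ollama` CLI is on the PATH and that the reply contains a single `<Output>` block; `extract_output` and `ask_marco` are hypothetical helper names, not part of Ollama.

```python
import re
import subprocess


def extract_output(text):
    """Return the text between <Output> and </Output>, or None if absent."""
    match = re.search(r"<Output>(.*?)</Output>", text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def ask_marco(prompt):
    """Run the model through the ollama CLI and return only the final answer.

    Assumes `ollama run marco-o1 "<prompt>"` prints the full reply to stdout.
    """
    result = subprocess.run(
        ["ollama", "run", "marco-o1", prompt],
        capture_output=True,
        text=True,
        check=True,
    )
    return extract_output(result.stdout)
```

Calling `ask_marco("How many Rs are in strawberry?")` would return just the answer text when the reply follows the format shown above, and `None` when the tags are missing.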