Lesson scoring, progression, and evidence panels built to tell one learner's story end to end: headline trend, lesson-level evidence, and a radar for every lesson, all on a single page.
Judgment Criteria
Switch users at the top, scan the summary strip, then move from trend to lesson evidence without leaving the page.
Users
Use the top switcher to move between users, then scan one learner’s full progression on a single page: headline trend, lesson-by-lesson metrics, and a radar for every lesson.
+16.5 vs baseline
69.5 peak score
One learner's story in chronological order
Lesson-level Progress Index in chronological order.
How the five fixed subscores moved lesson to lesson.
Click the i button next to any metric name to see what it means and how it is measured.
| Metric | Lesson 1 | Lesson 2 | Lesson 3 |
|---|---|---|---|
| Progress Index - What it shows: The overall speaking-based lesson score used to compare the student across lessons. How we measure it: Each raw metric is mapped to 0-100 using fixed anchors (not cohort-relative), then combined into five subscores and rolled up as 25% Fluency, 25% Accuracy, 20% Complexity, 15% Lexical Range, and 15% Engagement. Reading turns are tracked separately and excluded. | 52.938 | 52.274 | 69.471 |
| Student WPM - What it shows: How fast the student produced alphabetic words during speaking turns only. How we measure it: Speaking alpha-token count divided by speaking-turn minutes. Reading turns are excluded and shown separately in the reading activity section. | 44.145 | 49.764 | 83.356 |
| Student Talk Ratio (%) - What it shows: How much of the speaking-focused lesson word volume came from the student instead of the tutor. How we measure it: 100 times speaking student alpha tokens divided by speaking student alpha tokens plus tutor alpha tokens from speaking-focused interactions. | 59.44 | 34.874 | 68.324 |
| Avg Turn Words - What it shows: How much the student tends to say each time they take a speaking turn. How we measure it: Mean alphabetic word count per speaking student turn, recomputed over the lesson from speaking-only turns. | 19.594 | 21.721 | 43.141 |
| Response Latency (s) - What it shows: How quickly the student answers after the tutor finishes speaking in speaking-focused exchanges. How we measure it: Median gap in seconds between a tutor turn ending and the next student speaking turn beginning, recomputed over the full lesson timeline. | 0.265 | 0 | 0 |
| Elaboration Depth - What it shows: How much detail the student gives after open tutor prompts when responding in speaking mode. How we measure it: Average speaking-response length in words after tutor prompts such as why, how, tell me, or describe. | 6.308 | 9.171 | 17.879 |
| Avg Sentence Length - What it shows: How long the student's speaking sentences are on average. How we measure it: Mean alphabetic token count per detected speaking sentence, recomputed over the lesson from speaking-only sentences. | 4.178 | 4.693 | 9.079 |
| Sentences 10+ Words - What it shows: How many student sentences are at least ten words long. How we measure it: Count of detected student sentences whose alphabetic token count is 10 or more, summed across the lesson. | 29 | 67 | 151 |
| MATTR-50 - What it shows: Vocabulary variety in the student's speaking turns, with less distortion from lesson length than a simple unique-word ratio. How we measure it: Moving-average type-token ratio over 50-word windows from speaking tokens only; if there are fewer than 50 speaking words, we fall back to unique tokens divided by total tokens. | 0.629 | 0.508 | 0.673 |
| Complex Words (6+ chars) - What it shows: How often the student uses longer, more information-dense word forms while speaking. How we measure it: Count of speaking alpha tokens with 6 or more characters after deterministic cleanup and repeat collapsing. | 218 | 281 | 734 |
| Long Words (8+ chars %) - What it shows: The share of speaking words that are especially long. How we measure it: 100 times speaking alpha tokens with 8 or more characters divided by all speaking alpha tokens. | 5.874 | 5.042 | 8.493 |
| Pronunciation Clarity (ASR) - What it shows: A proxy for pronunciation clarity and accent. When the ASR system is less confident, the student's speech is usually harder for a listener to follow too. How we measure it: Mean word-level ASR confidence across the student's speaking tokens. Anchored to 0.70 → 0 and 0.97 → 100 in the Engagement subscore. | 0.892 | 0.942 | 0.939 |
| Low-Confidence Words (%) - What it shows: How much of the student's speaking the ASR system considered comparatively uncertain. How we measure it: 100 times speaking alpha tokens with ASR confidence below 0.90 divided by all speaking alpha tokens that include a confidence score. | 31.69 | 19.075 | 19.61 |
| Long Pauses >1s - What it shows: How often there is a noticeable pause in the student's speaking flow. How we measure it: Count of gaps greater than 1.0 second between consecutive speaking alpha tokens inside speaking turns, using word-level timestamps. | 215 | 209 | 253 |
| Filler Ratio (%) - What it shows: How much the student relies on filler words such as um, uh, erm, hmm, or like while speaking. How we measure it: 100 times filler-word tokens divided by speaking alpha tokens. This is separate from the broader disfluency metric, which also includes immediate repetitions. | 0.653 | 0.988 | 0.893 |
| Complex Sentence Rate (%) - What it shows: How often the student uses speaking sentences with linking or subordinating structure. How we measure it: Percentage of speaking sentences containing connector markers such as because, although, while, if, when, which, but, so, therefore, or since. | 8.434 | 17.275 | 37.685 |
| Mistakes (total) - What it shows: How many transcript-visible mistakes were flagged in the student's speaking turns for this lesson. How we measure it: Count of detections from the rule-based grammar checker (and optionally the LLM-enhanced pass) across all student speaking turns in the lesson. | 3 | 3 | 5 |
| Mistakes T1 (Core) - What it shows: Tier-1 core errors signal missing A2-level foundations and weigh heaviest in the Accuracy subscore. How we measure it: Count of detected mistakes labeled tier_1_core, such as subject-verb agreement slips, double negatives, or `I am agree`. | 2 | 1 | 0 |
| Mistakes T2 (Intermediate) - What it shows: Tier-2 errors in B1-B2 structures such as perfect tenses, prepositions, or second conditionals. How we measure it: Count of detected mistakes labeled tier_2_intermediate. | 1 | 2 | 5 |
| Mistakes T3 (Stretch) - What it shows: Tier-3 errors in C1+ structures the student is stretching into; weighted lightly. How we measure it: Count of detected mistakes labeled tier_3_stretch (mostly surfaced by the LLM-enhanced pass, rarely by rules). | 0 | 0 | 0 |
| Weighted Error Rate (per 100w) - What it shows: Raw Accuracy input: tier-weighted mistakes normalized to a per-100-word rate. How we measure it: (3 × T1 + 2 × T2 + 1 × T3) ÷ speaking alpha tokens × 100. | 0.58 | 0.364 | 0.27 |
| Tutor Correction Rate (%) - What it shows: How much direct tutor correction or scaffolding showed up during speaking-focused interactions. Shown as supplementary context: no longer part of the Progress Index because tutor style and student confidence both distort it. How we measure it: 100 times tutor correction-marker matches divided by tutor alpha tokens in speaking-focused interactions, using markers like you should, say, better, instead, remember, correct, mistake, pronounce, repeat, and means. | 0.425 | 1.141 | 0.642 |
| Reading Turns - What it shows: How many student turns in the lesson were classified as reading aloud rather than free speaking. How we measure it: Count of student dialogue turns labeled as reading after the reading-vs-speaking classifier runs on the lesson transcript. | 21 | 0 | 0 |
| Reading Alpha Tokens - What it shows: How much student word volume came from reading aloud. How we measure it: Count of alphabetic student tokens assigned to reading turns after repeat collapsing. | 386 | 0 | 0 |
| Reading Duration (s) - What it shows: How much lesson time the student spent reading aloud. How we measure it: Sum of durations for student turns labeled as reading. | 407.86 | 0 | 0 |
| Reading WPM - What it shows: How fast the student read aloud during reading turns. How we measure it: Reading alpha-token count divided by reading-turn minutes. This is tracked separately from speaking WPM and does not affect the Progress Index. | 56.784 | 0 | 0 |
| Reading Share (%) - What it shows: What share of the student's total word volume came from reading aloud. How we measure it: 100 times reading alpha tokens divided by all student alpha tokens from both speaking and reading turns. | 21.87 | 0 | 0 |
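The MATTR-50 fallback logic described above can be sketched in a few lines. This is a minimal illustration of the stated definition; the function name `mattr` and its signature are assumptions, not the dashboard's actual code.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over fixed-size windows.

    Falls back to a plain unique/total ratio when there are fewer
    tokens than one full window, as the metric description states.
    """
    if not tokens:
        return 0.0
    if len(tokens) < window:
        return len(set(tokens)) / len(tokens)
    # Average the type-token ratio of every sliding window.
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)
```

Because every window has the same length, long lessons cannot inflate the score the way a raw unique-word ratio would.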
Each lesson card shows the five subscores as a radar, the lesson summary, and an expandable panel with evidence plus segment-level contribution charts.
Lesson 1
The baseline Progress Index in Lesson 1 is 52.9. The most visible signals are Accuracy, Complexity, and Engagement. The main score is based on speaking turns only; reading activity is tracked separately.
Top drivers: Accuracy, Complexity, Engagement
Lesson 2
In Lesson 2, progress held mostly flat at 52.3 (-0.7 vs previous, -0.7 vs baseline). The biggest drivers were Lexical Range, Complexity, and Accuracy.
Top drivers: Lexical Range, Complexity, Accuracy
Lesson 3
In Lesson 3, progress rose to 69.5 (+17.2 vs previous, +16.5 vs baseline). The biggest drivers were Lexical Range, Complexity, and Engagement.
Top drivers: Lexical Range, Complexity, Engagement
Accuracy (+39.0)
At baseline there is no previous lesson to compare against; Accuracy contributes +39.0 with a tier-weighted error rate of 0.6 per 100 words, including 2 core (T1) errors.
Mistake: T1 · "i are"
Complexity (-27.3)
At baseline, Complexity contributes -27.3: average sentence length is 4.2 words and connector coverage is 8.4%.
Student: she is the she is the who i can stay even in painting which was displayed in the settings that's who participates in the book which is the person who i made at the conference th...
Engagement (+25.5)
At baseline, Engagement contributes +25.5: student speaking share is 0.59 and pronunciation clarity (ASR confidence) is 0.892.
Student: she is the she is the who i can stay even in painting which was displayed in the settings that's who participates in the book which is the person who i made at the conference th...
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 299.89 | 271 | 62.223 | 50.185 | 26.7 | 0.5 | 17 | 4.25 | 8 | 5.729 | 11 | 0.688 | 52 | 7.749 | 0.86 | 37.638 | 29 | 0 | 27.083 | 1.487 | 63.838 |
| S02 | 299.435 | 29 | 10.76 | 9.477 | 7.667 | 0 | 12 | 1 | 2 | 2 | 1 | 0.586 | 3 | 3.448 | 0.878 | 48.276 | 8 | 0 | 6.667 | 0 | 32.077 |
| S03 | 299.94 | 2 | 5.508 | 100 | 1 | 8.585 | 1 | 0 | 0 | 1 | 0 | 0.5 | 0 | 0 | 0.856 | 50 | 1 | 0 | 0 | 0 | 19.936 |
| S04 | 299.94 | 131 | 28.243 | 62.679 | 11.455 | 0 | 17 | 0 | 0 | 2.62 | 1 | 0.581 | 17 | 3.053 | 0.877 | 35.115 | 28 | 0 | 1.961 | 0 | 52.199 |
| S05 | 300 | 224 | 69.088 | 86.822 | 31.857 | 0.675 | 16 | 0 | 0 | 6.371 | 3 | 0.607 | 20 | 3.125 | 0.915 | 26.339 | 28 | 1.339 | 5.714 | 0 | 51.906 |
| S06 | 299.915 | 137 | 37.387 | 64.929 | 16.5 | 0.86 | 16 | 0 | 0 | 3.703 | 2 | 0.652 | 16 | 2.19 | 0.875 | 37.226 | 24 | 0 | 5.405 | 0 | 57.831 |
| S07 | 297.615 | 244 | 63.082 | 81.333 | 22.182 | 0 | 26 | 12 | 2 | 4.519 | 4 | 0.677 | 47 | 9.426 | 0.928 | 22.541 | 36 | 1.639 | 11.111 | 0 | 63.26 |
| S08 | 299.245 | 183 | 42.393 | 66.065 | 18.1 | 0.908 | 17 | 0 | 0 | 3.851 | 4 | 0.661 | 38 | 6.011 | 0.917 | 24.044 | 25 | 1.093 | 4.255 | 0 | 51.708 |
| S09 | 297.27 | 158 | 38.586 | 72.811 | 19.375 | 0.09 | 15 | 22 | 1 | 3.925 | 3 | 0.522 | 25 | 6.962 | 0.857 | 41.139 | 36 | 0 | 2.5 | 0 | 46.823 |
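The segment tables recompute Response Latency as the median tutor-end to student-start gap. A sketch, assuming turns arrive as chronological `(speaker, start_s, end_s)` tuples (an illustrative shape, not the real transcript schema):

```python
from statistics import median

def response_latency(turns):
    """Median gap in seconds between a tutor turn ending and the
    next student speaking turn beginning.

    `turns` is a chronological list of (speaker, start_s, end_s)
    tuples. Returns None when no tutor-then-student pair exists.
    """
    gaps = []
    for (spk_a, _, end_a), (spk_b, start_b, _) in zip(turns, turns[1:]):
        if spk_a == "tutor" and spk_b == "student":
            # Clamp at zero in case the student starts before
            # the tutor's turn is marked as finished (overlap).
            gaps.append(max(0.0, start_b - end_a))
    return median(gaps) if gaps else None
```

Using the median rather than the mean keeps one long pause for thought from dominating the lesson-level number.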
Lexical Range (-28.2)
Lexical range declined because MATTR moved from 0.629 to 0.508, which reflects how much vocabulary variety showed up in the lesson.
Student: and so i know and i know that now i don't i don't understand all what you are doing for example but i know that in future it will be possible because if we if we learning if we'...
Complexity (+7.8)
Complexity improved because average sentence length moved from 4.2 to 4.7 words and connector coverage shifted from 8.4% to 17.3%.
Student: so for the rest of the day i will have many plans first of all i will plan to go outside to buy something in the shop in the shop and then i will rest i will rest because i need...
Accuracy (+5.4)
Accuracy improved because tier-weighted error rate moved from 0.6 to 0.4 per 100 words, with core (T1) errors moving from 2 to 1.
Mistake: T1 · "more better"
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 299.913 | 314 | 70.753 | 48.532 | 25.25 | 0 | 33 | 16.167 | 6 | 5.339 | 14 | 0.549 | 45 | 4.459 | 0.948 | 16.879 | 24 | 0.318 | 25.424 | 1.502 | 57.687 |
| S02 | 301.257 | 210 | 43.676 | 34.091 | 22.333 | 0 | 43 | 3 | 6 | 5.098 | 7 | 0.548 | 36 | 8.095 | 0.94 | 22.857 | 27 | 0 | 17.073 | 1.478 | 52.733 |
| S03 | 301.36 | 273 | 70.006 | 47.478 | 27 | 0 | 25 | 18 | 1 | 5.688 | 9 | 0.546 | 36 | 4.396 | 0.949 | 16.85 | 17 | 1.465 | 25 | 1.325 | 52.404 |
| S04 | 299.308 | 209 | 43.758 | 32.453 | 20.4 | 0 | 30 | 15 | 4 | 4.245 | 5 | 0.526 | 39 | 5.263 | 0.93 | 22.967 | 20 | 3.828 | 20.408 | 0.46 | 53.655 |
| S05 | 299.91 | 170 | 38.338 | 29.514 | 13.75 | 0 | 41 | 4.714 | 7 | 2.982 | 3 | 0.538 | 26 | 7.647 | 0.935 | 22.353 | 30 | 1.765 | 10.526 | 1.232 | 51.339 |
| S06 | 300.065 | 197 | 44.356 | 33.791 | 27.857 | 0.44 | 33 | 3.667 | 3 | 6.387 | 11 | 0.315 | 20 | 1.015 | 0.945 | 16.244 | 25 | 0 | 3.226 | 1.554 | 43.834 |
| S07 | 299.728 | 134 | 32.297 | 20.489 | 21.5 | 0 | 33 | 10.5 | 2 | 4.5 | 7 | 0.392 | 13 | 0.746 | 0.921 | 20.896 | 22 | 0 | 0 | 2.5 | 45.864 |
| S08 | 300.108 | 328 | 68.096 | 46.724 | 32 | 0 | 36 | 24.5 | 2 | 6.812 | 10 | 0.582 | 57 | 7.012 | 0.956 | 14.939 | 28 | 0.915 | 35.417 | 0 | 64.045 |
| S09 | 184.65 | 89 | 30.7 | 17.115 | 8.1 | 0 | 45 | 3.5 | 4 | 1.958 | 1 | 0.5 | 9 | 4.494 | 0.915 | 28.09 | 16 | 0 | 6.25 | 0 | 44.757 |
Lexical Range (+41.0)
Lexical range improved because MATTR moved from 0.508 to 0.673, which reflects how much vocabulary variety showed up in the lesson.
Student: it's and sometimes it isn't in sense that when i sometimes i need to i need to use excel yeah as a as a program when it's very it's not very demanding but and it's not very comp...
Complexity (+31.8)
Complexity improved because average sentence length moved from 4.7 to 9.1 words and connector coverage shifted from 17.3% to 37.7%.
Student: it's and sometimes it isn't in sense that when i sometimes i need to i need to use excel yeah as a as a program when it's very it's not very demanding but and it's not very comp...
Engagement (+14.0)
Engagement improved because student speaking share rose from 0.35 to 0.68, while pronunciation clarity (ASR confidence) held roughly steady, moving from 0.942 to 0.939.
Student: it's and sometimes it isn't in sense that when i sometimes i need to i need to use excel yeah as a as a program when it's very it's not very demanding but and it's not very comp...
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 301.305 | 497 | 102.743 | 84.237 | 45 | 0 | 29 | 4.333 | 3 | 9.558 | 22 | 0.712 | 148 | 14.889 | 0.94 | 16.298 | 23 | 0.402 | 40.385 | 0 | 76.016 |
| S02 | 299.52 | 425 | 85.329 | 79.887 | 60.429 | 0 | 16 | 56.333 | 3 | 12.848 | 18 | 0.656 | 88 | 11.059 | 0.948 | 16.471 | 27 | 0.235 | 60.606 | 0.935 | 73.166 |
| S03 | 301.405 | 394 | 83.153 | 79.276 | 48.875 | 0 | 18 | 21.667 | 3 | 10.026 | 15 | 0.682 | 63 | 6.599 | 0.94 | 19.543 | 31 | 0.761 | 38.462 | 0.971 | 67.33 |
| S04 | 300.095 | 351 | 73.86 | 62.791 | 58.167 | 0.33 | 13 | 26.5 | 2 | 12.963 | 12 | 0.634 | 66 | 9.687 | 0.938 | 22.507 | 30 | 0.57 | 51.852 | 0 | 73.525 |
| S05 | 299.855 | 332 | 66.698 | 62.172 | 36.556 | 0 | 23 | 13.4 | 5 | 8.073 | 14 | 0.706 | 68 | 8.133 | 0.928 | 23.193 | 31 | 0.904 | 34.146 | 0.99 | 68.203 |
| S06 | 299.895 | 324 | 66.315 | 63.405 | 32.3 | 0.017 | 20 | 22.75 | 4 | 6.673 | 9 | 0.671 | 78 | 8.951 | 0.939 | 20.37 | 24 | 0.309 | 18.367 | 0 | 66.388 |
| S07 | 299.595 | 467 | 93.953 | 77.575 | 46.4 | 0 | 25 | 0 | 0 | 9.49 | 23 | 0.666 | 87 | 4.711 | 0.939 | 19.486 | 25 | 1.071 | 36.735 | 0 | 67.008 |
| S08 | 300.005 | 341 | 88.348 | 52.381 | 33.7 | 0 | 34 | 1 | 1 | 6.76 | 15 | 0.63 | 49 | 6.452 | 0.935 | 20.821 | 17 | 3.519 | 28 | 1.29 | 62.958 |
| S09 | 299.835 | 369 | 85.287 | 61.296 | 45.375 | 0 | 30 | 15.875 | 8 | 9.385 | 14 | 0.614 | 54 | 5.962 | 0.932 | 21.951 | 34 | 0.813 | 46.154 | 0.429 | 68.392 |
| S10 | 59.875 | 62 | 78.308 | 50 | 30 | 0 | 9 | 1 | 1 | 6.778 | 2 | 0.738 | 13 | 4.839 | 0.936 | 27.419 | 5 | 1.613 | 33.333 | 3.226 | 69.024 |
| S11 | 59.965 | 86 | 106.865 | 63.235 | 42.5 | 0.005 | 6 | 1 | 2 | 9.556 | 4 | 0.676 | 15 | 8.14 | 0.965 | 9.302 | 5 | 0 | 44.444 | 0 | 74.504 |
| S12 | 28.535 | 49 | 114.821 | 67.123 | 24 | 0 | 8 | 1 | 1 | 5.556 | 3 | 0.653 | 5 | 2.041 | 0.949 | 14.286 | 1 | 0 | 33.333 | 0 | 70.721 |
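Long Pauses >1s in these tables counts timestamp gaps between consecutive words inside speaking turns. A sketch, assuming word timings come as ordered `(start_s, end_s)` pairs (a hypothetical input shape):

```python
def count_long_pauses(word_times, threshold=1.0):
    """Count gaps greater than `threshold` seconds between
    consecutive spoken words.

    `word_times` is an ordered list of (start_s, end_s) pairs,
    one per word, taken from within speaking turns.
    """
    return sum(
        1
        for (_, prev_end), (next_start, _) in zip(word_times, word_times[1:])
        if next_start - prev_end > threshold
    )
```

In practice each speaking turn would be counted separately so a gap between two turns is never mistaken for a pause inside one.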
Use the top switcher to move between users, then scan one learner’s full progression on a single page: headline trend, lesson-by-lesson metrics, and a radar for every lesson.
-2.2 vs baseline
69.6 peak score
One learner's story in chronological order
Lesson-level Progress Index in chronological order.
How the five fixed subscores moved lesson to lesson.
Click the i button next to any metric name to see what it means and how it is measured.
Metric definitions are the same as in the first learner's table above.

| Metric | Lesson 1 | Lesson 2 |
|---|---|---|
| Progress Index | 69.581 | 67.364 |
| Student WPM | 93.934 | 93.017 |
| Student Talk Ratio (%) | 69.393 | 60.475 |
| Avg Turn Words | 44 | 49.674 |
| Response Latency (s) | 0 | 0 |
| Elaboration Depth | 8.643 | 3.862 |
| Avg Sentence Length | 9.136 | 10.607 |
| Sentences 10+ Words | 172 | 176 |
| MATTR-50 | 0.71 | 0.668 |
| Complex Words (6+ chars) | 963 | 763 |
| Long Words (8+ chars %) | 8.563 | 6.24 |
| Pronunciation Clarity (ASR) | 0.912 | 0.932 |
| Low-Confidence Words (%) | 26.747 | 20.684 |
| Long Pauses >1s | 373 | 243 |
| Filler Ratio (%) | 0.917 | 1.017 |
| Complex Sentence Rate (%) | 32.112 | 42.963 |
| Mistakes (total) | 2 | 4 |
| Mistakes T1 (Core) | 1 | 3 |
| Mistakes T2 (Intermediate) | 1 | 1 |
| Mistakes T3 (Stretch) | 0 | 0 |
| Weighted Error Rate (per 100w) | 0.118 | 0.254 |
| Tutor Correction Rate (%) | 0.8 | 0.601 |
| Reading Turns | 0 | 0 |
| Reading Alpha Tokens | 0 | 0 |
| Reading Duration (s) | 0 | 0 |
| Reading WPM | 0 | 0 |
| Reading Share (%) | 0 | 0 |
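The Weighted Error Rate formula, (3 × T1 + 2 × T2 + 1 × T3) ÷ speaking alpha tokens × 100, translates directly to code; the function name here is illustrative:

```python
def weighted_error_rate(t1, t2, t3, speaking_tokens):
    """Tier-weighted mistakes normalized to a per-100-word rate.

    Core (T1) errors weigh 3x, intermediate (T2) 2x, and
    stretch (T3) 1x; lessons with no speaking tokens score 0.
    """
    if speaking_tokens <= 0:
        return 0.0
    return (3 * t1 + 2 * t2 + 1 * t3) / speaking_tokens * 100
```

The tier weights mean one core error costs as much as three stretch errors, which is why the Accuracy evidence panels call out T1 counts specifically.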
Each lesson card shows the five subscores as a radar, the lesson summary, and an expandable panel with evidence plus segment-level contribution charts.
Lesson 1
The baseline Progress Index in Lesson 1 is 69.6. The most visible signals are Accuracy, Engagement, and Lexical Range.
Top drivers: Accuracy, Engagement, Lexical Range
Lesson 2
In Lesson 2, progress declined to 67.4 (-2.2 vs previous, -2.2 vs baseline). The biggest drivers were Lexical Range, Complexity, and Accuracy.
Top drivers: Lexical Range, Complexity, Accuracy
Accuracy (+47.3)
At baseline there is no previous lesson to compare against; Accuracy contributes +47.3 with a tier-weighted error rate of 0.1 per 100 words, including 1 core (T1) error.
Mistake: T1 · "it like"
Engagement (+33.0)
At baseline, Engagement contributes +33.0: student speaking share is 0.69 and pronunciation clarity (ASR confidence) is 0.912.
Student: right and it end up with this cold war and it adapt with kind of so socialistic communism idea including poland and nobody asked us whether we want to be there or not this was j...
Lexical Range (+25.4)
At baseline, Lexical Range contributes +25.4: MATTR is 0.710, which reflects how much vocabulary variety showed up in the lesson.
Student: right and it end up with this cold war and it adapt with kind of so socialistic communism idea including poland and nobody asked us whether we want to be there or not this was j...
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 187.1 | 34 | 108.54 | 61.818 | 17 | 0 | 7 | 10 | 1 | 3.5 | 0 | 0.824 | 7 | 0 | 0.911 | 26.471 | 4 | 0 | 20 | 0 | 67.641 |
| S02 | 300 | 396 | 85.847 | 58.929 | 35.727 | 0 | 64 | 3.2 | 5 | 7.784 | 17 | 0.689 | 78 | 7.576 | 0.893 | 33.333 | 39 | 0 | 37.255 | 2.174 | 67.649 |
| S03 | 299.615 | 355 | 77.119 | 59.564 | 59 | 0 | 52 | 4 | 2 | 12.207 | 15 | 0.714 | 76 | 4.225 | 0.879 | 32.958 | 43 | 1.408 | 62.069 | 0.415 | 68.204 |
| S04 | 300 | 283 | 90.853 | 47.167 | 56.2 | 0 | 18 | 0 | 0 | 11.75 | 13 | 0.731 | 60 | 6.007 | 0.921 | 25.442 | 32 | 1.06 | 29.167 | 0 | 71.962 |
| S05 | 299.975 | 438 | 90.809 | 73.122 | 48.667 | 0 | 47 | 0 | 0 | 10.429 | 18 | 0.74 | 107 | 8.904 | 0.922 | 24.658 | 36 | 0 | 38.095 | 1.242 | 70.887 |
| S06 | 299.68 | 508 | 105.648 | 84.526 | 46.091 | 0 | 43 | 32 | 1 | 9.407 | 21 | 0.708 | 116 | 9.055 | 0.912 | 27.165 | 35 | 1.181 | 25.926 | 1.075 | 69.847 |
| S07 | 300 | 496 | 101.399 | 83.784 | 44.545 | 0 | 59 | 0 | 0 | 8.927 | 16 | 0.667 | 117 | 8.871 | 0.911 | 24.597 | 37 | 2.218 | 18.182 | 0 | 62.866 |
| S08 | 299.785 | 434 | 108.731 | 68.454 | 35.75 | 0 | 28 | 3 | 1 | 7.414 | 16 | 0.719 | 107 | 9.677 | 0.92 | 25.346 | 30 | 0.461 | 22.414 | 0.5 | 69.044 |
| S09 | 299.885 | 458 | 93.619 | 81.495 | 50.667 | 0 | 38 | 42 | 1 | 10.364 | 23 | 0.685 | 113 | 9.17 | 0.908 | 29.039 | 37 | 1.747 | 31.818 | 0.962 | 70.106 |
| S10 | 300 | 387 | 84.544 | 59.084 | 29.231 | 0 | 46 | 3.333 | 3 | 6.031 | 14 | 0.753 | 86 | 10.078 | 0.922 | 23.773 | 35 | 0.775 | 28.125 | 1.119 | 67.604 |
| S11 | 299.775 | 462 | 99.861 | 82.5 | 66 | 0 | 28 | 0 | 0 | 14 | 19 | 0.734 | 96 | 10.823 | 0.929 | 22.511 | 45 | 0.216 | 54.545 | 0 | 78.734 |
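Complex Sentence Rate hinges on connector detection. A sketch using the connector list from the metric description; the tokenization regex is an assumption:

```python
import re

# Connector markers named in the metric description.
CONNECTORS = {"because", "although", "while", "if", "when",
              "which", "but", "so", "therefore", "since"}

def complex_sentence_rate(sentences):
    """Percentage of sentences containing at least one connector."""
    if not sentences:
        return 0.0
    hits = sum(
        1 for s in sentences
        if CONNECTORS & set(re.findall(r"[a-z']+", s.lower()))
    )
    return 100 * hits / len(sentences)
```

Matching whole tokens rather than substrings matters here: a substring check would count "butter" as containing "but".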
Lexical Range (-13.2)
Lexical range declined because MATTR moved from 0.710 to 0.668, which reflects how much vocabulary variety showed up in the lesson.
Student: it's not a big deal unless you're getting in a panic and also some people actually some people using the safety jacket which is which the life jackets which is big mistake which...
Complexity (+6.1)
Complexity improved because average sentence length moved from 9.1 to 10.6 words and connector coverage shifted from 32.1% to 43.0%.
Student: it's not a big deal unless you're getting in a panic and also some people actually some people using the safety jacket which is which the life jackets which is big mistake which...
Accuracy (-3.1)
Accuracy declined because tier-weighted error rate moved from 0.1 to 0.3 per 100 words, with core (T1) errors moving from 1 to 3.
Mistake: T1 · "it like"
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 299.515 | 415 | 86.297 | 65.665 | 45.778 | 0 | 68 | 5.5 | 6 | 9.605 | 16 | 0.701 | 80 | 5.06 | 0.934 | 21.446 | 31 | 1.446 | 44.186 | 0.922 | 69.589 |
| S02 | 300.015 | 427 | 88.109 | 64.211 | 52.875 | 0 | 73 | 0 | 0 | 11.158 | 19 | 0.674 | 84 | 4.45 | 0.938 | 20.375 | 24 | 1.171 | 34.211 | 0 | 68.321 |
| S03 | 300.115 | 412 | 93.875 | 56.284 | 45.111 | 0 | 84 | 1.5 | 2 | 10 | 14 | 0.654 | 71 | 6.068 | 0.933 | 20.874 | 33 | 1.456 | 34.146 | 0.312 | 66.259 |
| S04 | 299.95 | 514 | 104.33 | 64.492 | 56.556 | 0 | 79 | 3 | 4 | 12.415 | 22 | 0.663 | 88 | 6.615 | 0.944 | 16.732 | 21 | 0.584 | 53.659 | 0.353 | 72.338 |
| S05 | 299.93 | 517 | 104.299 | 74.603 | 56.667 | 0 | 73 | 1 | 1 | 11.422 | 21 | 0.671 | 76 | 5.609 | 0.928 | 21.857 | 24 | 1.547 | 42.222 | 0 | 65.634 |
| S06 | 299.775 | 479 | 97.993 | 64.73 | 52.444 | 0 | 96 | 4.875 | 8 | 10.556 | 19 | 0.701 | 82 | 6.263 | 0.936 | 19.207 | 24 | 1.253 | 53.333 | 0.766 | 69.16 |
| S07 | 299 | 404 | 91.133 | 57.714 | 49.75 | 0 | 48 | 1 | 2 | 10.553 | 17 | 0.641 | 77 | 8.663 | 0.93 | 21.535 | 22 | 0.248 | 36.842 | 0 | 68.784 |
| S08 | 299.365 | 548 | 112.268 | 69.455 | 60.111 | 0 | 121 | 0 | 0 | 12.022 | 19 | 0.631 | 102 | 6.387 | 0.929 | 19.891 | 22 | 0.547 | 42.222 | 0.415 | 71.901 |
| S09 | 299.97 | 423 | 84.608 | 58.025 | 52.25 | 0 | 77 | 4.75 | 4 | 10.744 | 20 | 0.654 | 80 | 7.329 | 0.917 | 24.586 | 29 | 1.182 | 56.41 | 1.634 | 65.214 |
| S10 | 60.115 | 101 | 100.807 | 65.584 | 50.5 | 0 | 13 | 0 | 0 | 16.833 | 6 | 0.671 | 8 | 4.95 | 0.941 | 19.802 | 4 | 0.99 | 66.667 | 0 | 76.263 |
| S11 | 74.745 | 59 | 47.616 | 39.865 | 19 | 0 | 14 | 1 | 1 | 4.692 | 2 | 0.718 | 12 | 8.475 | 0.892 | 28.814 | 6 | 0 | 23.077 | 0 | 45.766 |
| S12 | 61.805 | 23 | 46.685 | 12.234 | 11 | 0 | 12 | 0 | 0 | 3.667 | 1 | 0.739 | 3 | 4.348 | 0.914 | 21.739 | 1 | 0 | 16.667 | 1.818 | 52.758 |
| S13 | 61.355 | 5 | 7.634 | 2.66 | 3 | 0 | 5 | 2 | 1 | 1.2 | 0 | 0.6 | 0 | 0 | 0.988 | 0 | 2 | 0 | 0 | 1.093 | 24.819 |
Scoring reference

- Fluency = 0.40*wpm + 0.30*inverse_disfluency + 0.30*inverse_long_pause_rate
- Accuracy = log-mapped tier-weighted mistake rate
- Complexity = 0.40*avg_sentence_len + 0.35*complex_sentence_rate + 0.25*long_words_pct
- Lexical Range = MATTR-50
- Engagement = 0.50*asr_confidence + 0.25*speaking_share + 0.15*elaboration_depth + 0.10*inverse_response_latency
- student_response_latency_s = median gap between a tutor turn ending and the next student turn beginning
- elaboration_depth = average student response length after open tutor prompts like why, how, tell me, or describe
- tutor_correction_rate = supplementary context only; it does not affect the Progress Index
- student_reading_wpm = reading alpha tokens divided by reading-turn minutes
- student_reading_share_pct = share of the student's total alpha tokens that came from reading turns
- Progress Index = subscore_fluency × 0.25 + subscore_accuracy × 0.25 + subscore_complexity × 0.20 + subscore_lexical_range × 0.15 + subscore_engagement × 0.15
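The subscore weights above roll up into the Progress Index as a simple weighted sum; this sketch assumes the subscores already sit on the 0-100 anchor scale:

```python
# Fixed roll-up weights from the scoring reference.
SUBSCORE_WEIGHTS = {
    "fluency": 0.25,
    "accuracy": 0.25,
    "complexity": 0.20,
    "lexical_range": 0.15,
    "engagement": 0.15,
}

def progress_index(subscores):
    """Combine five 0-100 subscores into the Progress Index."""
    missing = set(SUBSCORE_WEIGHTS) - set(subscores)
    if missing:
        raise ValueError(f"missing subscores: {sorted(missing)}")
    return sum(SUBSCORE_WEIGHTS[k] * subscores[k] for k in SUBSCORE_WEIGHTS)
```

Since the weights sum to 1.0, the Progress Index stays on the same 0-100 scale as its inputs.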