Lesson scoring, progression, and evidence panels built to tell one learner's story end to end: headline trend, lesson-level evidence, and a radar for every lesson, all on a single page.
Judgment Criteria
Switch users at the top, scan the summary strip, then move from trend to lesson evidence without leaving the page.
Users
Use the top switcher to move between users, then scan one learner’s full progression on a single page: headline trend, lesson-by-lesson metrics, and a radar for every lesson.
+16.5 vs baseline
69.5 peak score
One learner's story in chronological order
Lesson-level Progress Index in chronological order.
How the five fixed subscores moved lesson to lesson.
Click the i button next to any metric name to see what it means and how it is measured.
| Metric | Lesson 1 | Lesson 2 | Lesson 3 |
|---|---|---|---|
| Progress Index - What it shows: The overall speaking-based lesson score used to compare the student across lessons. How we measure it: Each raw metric is mapped to 0-100 using fixed anchors (not cohort-relative), then combined into five subscores and rolled up as 25% Fluency, 25% Accuracy, 20% Complexity, 15% Lexical Range, and 15% Engagement. Reading turns are tracked separately and excluded. | 52.938 | 52.274 | 69.471 |
| Student WPM - What it shows: How fast the student produced alphabetic words during speaking turns only. How we measure it: Speaking alpha-token count divided by speaking-turn minutes. Reading turns are excluded and shown separately in the reading activity section. | 44.145 | 49.764 | 83.356 |
| Student Talk Ratio (%) - What it shows: How much of the speaking-focused lesson word volume came from the student instead of the tutor. How we measure it: 100 times speaking student alpha tokens divided by speaking student alpha tokens plus tutor alpha tokens from speaking-focused interactions. | 59.44 | 34.874 | 68.324 |
| Avg Turn Words - What it shows: How much the student tends to say each time they take a speaking turn. How we measure it: Mean alphabetic word count per speaking student turn, recomputed over the lesson from speaking-only turns. | 19.594 | 21.721 | 43.141 |
| Response Latency (s) - What it shows: How quickly the student answers after the tutor finishes speaking in speaking-focused exchanges. How we measure it: Median gap in seconds between a tutor turn ending and the next student speaking turn beginning, recomputed over the full lesson timeline. | 0.265 | 0 | 0 |
| Elaboration Depth - What it shows: How much detail the student gives after open tutor prompts when responding in speaking mode. How we measure it: Average speaking-response length in words after tutor prompts such as why, how, tell me, or describe. | 6.308 | 9.171 | 17.879 |
| Avg Sentence Length - What it shows: How long the student's speaking sentences are on average. How we measure it: Mean alphabetic token count per detected speaking sentence, recomputed over the lesson from speaking-only sentences. | 4.178 | 4.693 | 9.079 |
| Sentences 10+ Words - What it shows: How many student sentences are at least ten words long. How we measure it: Count of detected student sentences whose alphabetic token count is 10 or more, summed across the lesson. | 29 | 67 | 151 |
| MATTR-50 - What it shows: Vocabulary variety in the student's speaking turns, with less distortion from lesson length than a simple unique-word ratio. How we measure it: Moving-average type-token ratio over 50-word windows from speaking tokens only; if there are fewer than 50 speaking words, we fall back to unique tokens divided by total tokens. | 0.629 | 0.508 | 0.673 |
| Complex Words (6+ chars) - What it shows: How often the student uses longer, more information-dense word forms while speaking. How we measure it: Count of speaking alpha tokens with 6 or more characters after deterministic cleanup and repeat collapsing. | 218 | 281 | 734 |
| Long Words (8+ chars %) - What it shows: The share of speaking words that are especially long. How we measure it: 100 times speaking alpha tokens with 8 or more characters divided by all speaking alpha tokens. | 5.874 | 5.042 | 8.493 |
| Pronunciation Clarity (ASR) - What it shows: A proxy for pronunciation clarity and accent. When the ASR system is less confident, the student's speech is usually harder for a listener to follow too. How we measure it: Mean word-level ASR confidence across the student's speaking tokens. Anchored to 0.70 → 0 and 0.97 → 100 in the Engagement subscore. | 0.892 | 0.942 | 0.939 |
| Low-Confidence Words (%) - What it shows: How much of the student's speaking the ASR system considered comparatively uncertain. How we measure it: 100 times speaking alpha tokens with ASR confidence below 0.90 divided by all speaking alpha tokens that include a confidence score. | 31.69 | 19.075 | 19.61 |
| Long Pauses >1s - What it shows: How often there is a noticeable pause in the student's speaking flow. How we measure it: Count of gaps greater than 1.0 second between consecutive speaking alpha tokens inside speaking turns, using word-level timestamps. | 215 | 209 | 253 |
| Filler Ratio (%) - What it shows: How much the student relies on filler words such as um, uh, erm, hmm, or like while speaking. How we measure it: 100 times filler-word tokens divided by speaking alpha tokens. This is separate from the broader disfluency metric, which also includes immediate repetitions. | 0.653 | 0.988 | 0.893 |
| Complex Sentence Rate (%) - What it shows: How often the student uses speaking sentences with linking or subordinating structure. How we measure it: Percentage of speaking sentences containing connector markers such as because, although, while, if, when, which, but, so, therefore, or since. | 8.434 | 17.275 | 37.685 |
| Mistakes (total) - What it shows: How many transcript-visible mistakes were flagged in the student's speaking turns for this lesson. How we measure it: Count of detections from the rule-based grammar checker (and optionally the LLM-enhanced pass) across all student speaking turns in the lesson. | 3 | 3 | 5 |
| Mistakes T1 (Core) - What it shows: Tier-1 core errors signal missing A2-level foundations and weigh heaviest in the Accuracy subscore. How we measure it: Count of detected mistakes labeled tier_1_core, such as subject-verb agreement slips, double negatives, or `I am agree`. | 2 | 1 | 0 |
| Mistakes T2 (Intermediate) - What it shows: Tier-2 errors in B1-B2 structures such as perfect tenses, prepositions, or second conditionals. How we measure it: Count of detected mistakes labeled tier_2_intermediate. | 1 | 2 | 5 |
| Mistakes T3 (Stretch) - What it shows: Tier-3 errors in C1+ structures the student is stretching into; weighted lightly. How we measure it: Count of detected mistakes labeled tier_3_stretch (mostly surfaced by the LLM-enhanced pass, rarely by rules). | 0 | 0 | 0 |
| Weighted Error Rate (per 100w) - What it shows: Raw Accuracy input: tier-weighted mistakes normalized to a per-100-word rate. How we measure it: (3 × T1 + 2 × T2 + 1 × T3) ÷ speaking alpha tokens × 100. | 0.58 | 0.364 | 0.27 |
| Tutor Correction Rate (%) - What it shows: How much direct tutor correction or scaffolding showed up during speaking-focused interactions. Shown as supplementary context: no longer part of the Progress Index because tutor style and student confidence both distort it. How we measure it: 100 times tutor correction-marker matches divided by tutor alpha tokens in speaking-focused interactions, using markers like you should, say, better, instead, remember, correct, mistake, pronounce, repeat, and means. | 0.425 | 1.141 | 0.642 |
| Reading Turns - What it shows: How many student turns in the lesson were classified as reading aloud rather than free speaking. How we measure it: Count of student dialogue turns labeled as reading after the reading-vs-speaking classifier runs on the lesson transcript. | 21 | 0 | 0 |
| Reading Alpha Tokens - What it shows: How much student word volume came from reading aloud. How we measure it: Count of alphabetic student tokens assigned to reading turns after repeat collapsing. | 386 | 0 | 0 |
| Reading Duration (s) - What it shows: How much lesson time the student spent reading aloud. How we measure it: Sum of durations for student turns labeled as reading. | 407.86 | 0 | 0 |
| Reading WPM - What it shows: How fast the student read aloud during reading turns. How we measure it: Reading alpha-token count divided by reading-turn minutes. This is tracked separately from speaking WPM and does not affect the Progress Index. | 56.784 | 0 | 0 |
| Reading Share (%) - What it shows: What share of the student's total word volume came from reading aloud. How we measure it: 100 times reading alpha tokens divided by all student alpha tokens from both speaking and reading turns. | 21.87 | 0 | 0 |
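The MATTR-50 fallback logic described above can be sketched in a few lines. This is a minimal illustration of the stated definition; the function name `mattr` and its signature are assumptions, not the dashboard's actual code.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over fixed-size windows.

    Falls back to a plain unique/total ratio when there are fewer
    tokens than one full window, as the metric description states.
    """
    if not tokens:
        return 0.0
    if len(tokens) < window:
        return len(set(tokens)) / len(tokens)
    # Average the type-token ratio of every sliding window.
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)
```

Because every window has the same length, long lessons cannot inflate the score the way a raw unique-word ratio would.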
Each lesson card shows the five subscores as a radar, the lesson summary, and an expandable panel with evidence plus segment-level contribution charts.
Lesson 1
The baseline Progress Index in Lesson 1 is 52.9. The most visible signals are Accuracy, Complexity, and Engagement. The main score is based on speaking turns only; reading activity is tracked separately.
Top drivers: Accuracy, Complexity, Engagement
Lesson 2
In Lesson 2, progress held mostly flat at 52.3 (-0.7 vs previous, -0.7 vs baseline). The biggest drivers were Lexical Range, Complexity, and Accuracy.
Top drivers: Lexical Range, Complexity, Accuracy
Lesson 3
In Lesson 3, progress rose to 69.5 (+17.2 vs previous, +16.5 vs baseline). The biggest drivers were Lexical Range, Complexity, and Engagement.
Top drivers: Lexical Range, Complexity, Engagement
Accuracy (+39.0)
At baseline there is no previous lesson to compare against; Accuracy contributes +39.0 with a tier-weighted error rate of 0.6 per 100 words, including 2 core (T1) errors.
Mistake: T1 · "i are"
Complexity (-27.3)
At baseline, Complexity contributes -27.3: average sentence length is 4.2 words and connector coverage is 8.4%.
Student: she is the she is the who i can stay even in painting which was displayed in the settings that's who participates in the book which is the person who i made at the conference th...
Engagement (+25.5)
At baseline, Engagement contributes +25.5: student speaking share is 0.59 and pronunciation clarity (ASR confidence) is 0.892.
Student: she is the she is the who i can stay even in painting which was displayed in the settings that's who participates in the book which is the person who i made at the conference th...
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 299.89 | 271 | 62.223 | 50.185 | 26.7 | 0.5 | 17 | 4.25 | 8 | 5.729 | 11 | 0.688 | 52 | 7.749 | 0.86 | 37.638 | 29 | 0 | 27.083 | 1.487 | 63.838 |
| S02 | 299.435 | 29 | 10.76 | 9.477 | 7.667 | 0 | 12 | 1 | 2 | 2 | 1 | 0.586 | 3 | 3.448 | 0.878 | 48.276 | 8 | 0 | 6.667 | 0 | 32.077 |
| S03 | 299.94 | 2 | 5.508 | 100 | 1 | 8.585 | 1 | 0 | 0 | 1 | 0 | 0.5 | 0 | 0 | 0.856 | 50 | 1 | 0 | 0 | 0 | 19.936 |
| S04 | 299.94 | 131 | 28.243 | 62.679 | 11.455 | 0 | 17 | 0 | 0 | 2.62 | 1 | 0.581 | 17 | 3.053 | 0.877 | 35.115 | 28 | 0 | 1.961 | 0 | 52.199 |
| S05 | 300 | 224 | 69.088 | 86.822 | 31.857 | 0.675 | 16 | 0 | 0 | 6.371 | 3 | 0.607 | 20 | 3.125 | 0.915 | 26.339 | 28 | 1.339 | 5.714 | 0 | 51.906 |
| S06 | 299.915 | 137 | 37.387 | 64.929 | 16.5 | 0.86 | 16 | 0 | 0 | 3.703 | 2 | 0.652 | 16 | 2.19 | 0.875 | 37.226 | 24 | 0 | 5.405 | 0 | 57.831 |
| S07 | 297.615 | 244 | 63.082 | 81.333 | 22.182 | 0 | 26 | 12 | 2 | 4.519 | 4 | 0.677 | 47 | 9.426 | 0.928 | 22.541 | 36 | 1.639 | 11.111 | 0 | 63.26 |
| S08 | 299.245 | 183 | 42.393 | 66.065 | 18.1 | 0.908 | 17 | 0 | 0 | 3.851 | 4 | 0.661 | 38 | 6.011 | 0.917 | 24.044 | 25 | 1.093 | 4.255 | 0 | 51.708 |
| S09 | 297.27 | 158 | 38.586 | 72.811 | 19.375 | 0.09 | 15 | 22 | 1 | 3.925 | 3 | 0.522 | 25 | 6.962 | 0.857 | 41.139 | 36 | 0 | 2.5 | 0 | 46.823 |
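The segment tables recompute Response Latency as the median tutor-end to student-start gap. A sketch, assuming turns arrive as chronological `(speaker, start_s, end_s)` tuples (an illustrative shape, not the real transcript schema):

```python
from statistics import median

def response_latency(turns):
    """Median gap in seconds between a tutor turn ending and the
    next student speaking turn beginning.

    `turns` is a chronological list of (speaker, start_s, end_s)
    tuples. Returns None when no tutor-then-student pair exists.
    """
    gaps = []
    for (spk_a, _, end_a), (spk_b, start_b, _) in zip(turns, turns[1:]):
        if spk_a == "tutor" and spk_b == "student":
            # Clamp at zero in case the student starts before
            # the tutor's turn is marked as finished (overlap).
            gaps.append(max(0.0, start_b - end_a))
    return median(gaps) if gaps else None
```

Using the median rather than the mean keeps one long pause for thought from dominating the lesson-level number.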
Lexical Range (-28.2)
Lexical range declined because MATTR moved from 0.629 to 0.508, which reflects how much vocabulary variety showed up in the lesson.
Student: and so i know and i know that now i don't i don't understand all what you are doing for example but i know that in future it will be possible because if we if we learning if we'...
Complexity (+7.8)
Complexity improved because average sentence length moved from 4.2 to 4.7 words and connector coverage shifted from 8.4% to 17.3%.
Student: so for the rest of the day i will have many plans first of all i will plan to go outside to buy something in the shop in the shop and then i will rest i will rest because i need...
Accuracy (+5.4)
Accuracy improved because tier-weighted error rate moved from 0.6 to 0.4 per 100 words, with core (T1) errors moving from 2 to 1.
Mistake: T1 · "more better"
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 299.913 | 314 | 70.753 | 48.532 | 25.25 | 0 | 33 | 16.167 | 6 | 5.339 | 14 | 0.549 | 45 | 4.459 | 0.948 | 16.879 | 24 | 0.318 | 25.424 | 1.502 | 57.687 |
| S02 | 301.257 | 210 | 43.676 | 34.091 | 22.333 | 0 | 43 | 3 | 6 | 5.098 | 7 | 0.548 | 36 | 8.095 | 0.94 | 22.857 | 27 | 0 | 17.073 | 1.478 | 52.733 |
| S03 | 301.36 | 273 | 70.006 | 47.478 | 27 | 0 | 25 | 18 | 1 | 5.688 | 9 | 0.546 | 36 | 4.396 | 0.949 | 16.85 | 17 | 1.465 | 25 | 1.325 | 52.404 |
| S04 | 299.308 | 209 | 43.758 | 32.453 | 20.4 | 0 | 30 | 15 | 4 | 4.245 | 5 | 0.526 | 39 | 5.263 | 0.93 | 22.967 | 20 | 3.828 | 20.408 | 0.46 | 53.655 |
| S05 | 299.91 | 170 | 38.338 | 29.514 | 13.75 | 0 | 41 | 4.714 | 7 | 2.982 | 3 | 0.538 | 26 | 7.647 | 0.935 | 22.353 | 30 | 1.765 | 10.526 | 1.232 | 51.339 |
| S06 | 300.065 | 197 | 44.356 | 33.791 | 27.857 | 0.44 | 33 | 3.667 | 3 | 6.387 | 11 | 0.315 | 20 | 1.015 | 0.945 | 16.244 | 25 | 0 | 3.226 | 1.554 | 43.834 |
| S07 | 299.728 | 134 | 32.297 | 20.489 | 21.5 | 0 | 33 | 10.5 | 2 | 4.5 | 7 | 0.392 | 13 | 0.746 | 0.921 | 20.896 | 22 | 0 | 0 | 2.5 | 45.864 |
| S08 | 300.108 | 328 | 68.096 | 46.724 | 32 | 0 | 36 | 24.5 | 2 | 6.812 | 10 | 0.582 | 57 | 7.012 | 0.956 | 14.939 | 28 | 0.915 | 35.417 | 0 | 64.045 |
| S09 | 184.65 | 89 | 30.7 | 17.115 | 8.1 | 0 | 45 | 3.5 | 4 | 1.958 | 1 | 0.5 | 9 | 4.494 | 0.915 | 28.09 | 16 | 0 | 6.25 | 0 | 44.757 |
Lexical Range (+41.0)
Lexical range improved because MATTR moved from 0.508 to 0.673, which reflects how much vocabulary variety showed up in the lesson.
Student: it's and sometimes it isn't in sense that when i sometimes i need to i need to use excel yeah as a as a program when it's very it's not very demanding but and it's not very comp...
Complexity (+31.8)
Complexity improved because average sentence length moved from 4.7 to 9.1 words and connector coverage shifted from 17.3% to 37.7%.
Student: it's and sometimes it isn't in sense that when i sometimes i need to i need to use excel yeah as a as a program when it's very it's not very demanding but and it's not very comp...
Engagement (+14.0)
Engagement improved because student speaking share rose from 0.35 to 0.68, while pronunciation clarity (ASR confidence) held roughly steady, moving from 0.942 to 0.939.
Student: it's and sometimes it isn't in sense that when i sometimes i need to i need to use excel yeah as a as a program when it's very it's not very demanding but and it's not very comp...
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 301.305 | 497 | 102.743 | 84.237 | 45 | 0 | 29 | 4.333 | 3 | 9.558 | 22 | 0.712 | 148 | 14.889 | 0.94 | 16.298 | 23 | 0.402 | 40.385 | 0 | 76.016 |
| S02 | 299.52 | 425 | 85.329 | 79.887 | 60.429 | 0 | 16 | 56.333 | 3 | 12.848 | 18 | 0.656 | 88 | 11.059 | 0.948 | 16.471 | 27 | 0.235 | 60.606 | 0.935 | 73.166 |
| S03 | 301.405 | 394 | 83.153 | 79.276 | 48.875 | 0 | 18 | 21.667 | 3 | 10.026 | 15 | 0.682 | 63 | 6.599 | 0.94 | 19.543 | 31 | 0.761 | 38.462 | 0.971 | 67.33 |
| S04 | 300.095 | 351 | 73.86 | 62.791 | 58.167 | 0.33 | 13 | 26.5 | 2 | 12.963 | 12 | 0.634 | 66 | 9.687 | 0.938 | 22.507 | 30 | 0.57 | 51.852 | 0 | 73.525 |
| S05 | 299.855 | 332 | 66.698 | 62.172 | 36.556 | 0 | 23 | 13.4 | 5 | 8.073 | 14 | 0.706 | 68 | 8.133 | 0.928 | 23.193 | 31 | 0.904 | 34.146 | 0.99 | 68.203 |
| S06 | 299.895 | 324 | 66.315 | 63.405 | 32.3 | 0.017 | 20 | 22.75 | 4 | 6.673 | 9 | 0.671 | 78 | 8.951 | 0.939 | 20.37 | 24 | 0.309 | 18.367 | 0 | 66.388 |
| S07 | 299.595 | 467 | 93.953 | 77.575 | 46.4 | 0 | 25 | 0 | 0 | 9.49 | 23 | 0.666 | 87 | 4.711 | 0.939 | 19.486 | 25 | 1.071 | 36.735 | 0 | 67.008 |
| S08 | 300.005 | 341 | 88.348 | 52.381 | 33.7 | 0 | 34 | 1 | 1 | 6.76 | 15 | 0.63 | 49 | 6.452 | 0.935 | 20.821 | 17 | 3.519 | 28 | 1.29 | 62.958 |
| S09 | 299.835 | 369 | 85.287 | 61.296 | 45.375 | 0 | 30 | 15.875 | 8 | 9.385 | 14 | 0.614 | 54 | 5.962 | 0.932 | 21.951 | 34 | 0.813 | 46.154 | 0.429 | 68.392 |
| S10 | 59.875 | 62 | 78.308 | 50 | 30 | 0 | 9 | 1 | 1 | 6.778 | 2 | 0.738 | 13 | 4.839 | 0.936 | 27.419 | 5 | 1.613 | 33.333 | 3.226 | 69.024 |
| S11 | 59.965 | 86 | 106.865 | 63.235 | 42.5 | 0.005 | 6 | 1 | 2 | 9.556 | 4 | 0.676 | 15 | 8.14 | 0.965 | 9.302 | 5 | 0 | 44.444 | 0 | 74.504 |
| S12 | 28.535 | 49 | 114.821 | 67.123 | 24 | 0 | 8 | 1 | 1 | 5.556 | 3 | 0.653 | 5 | 2.041 | 0.949 | 14.286 | 1 | 0 | 33.333 | 0 | 70.721 |
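Long Pauses >1s in these tables counts timestamp gaps between consecutive words inside speaking turns. A sketch, assuming word timings come as ordered `(start_s, end_s)` pairs (a hypothetical input shape):

```python
def count_long_pauses(word_times, threshold=1.0):
    """Count gaps greater than `threshold` seconds between
    consecutive spoken words.

    `word_times` is an ordered list of (start_s, end_s) pairs,
    one per word, taken from within speaking turns.
    """
    return sum(
        1
        for (_, prev_end), (next_start, _) in zip(word_times, word_times[1:])
        if next_start - prev_end > threshold
    )
```

In practice each speaking turn would be counted separately so a gap between two turns is never mistaken for a pause inside one.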
Use the top switcher to move between users, then scan one learner’s full progression on a single page: headline trend, lesson-by-lesson metrics, and a radar for every lesson.
-2.2 vs baseline
69.6 peak score
One learner's story in chronological order
Lesson-level Progress Index in chronological order.
How the five fixed subscores moved lesson to lesson.
Click the i button next to any metric name to see what it means and how it is measured.
Metric definitions are the same as in the first learner's table above.

| Metric | Lesson 1 | Lesson 2 |
|---|---|---|
| Progress Index | 69.581 | 67.364 |
| Student WPM | 93.934 | 93.017 |
| Student Talk Ratio (%) | 69.393 | 60.475 |
| Avg Turn Words | 44 | 49.674 |
| Response Latency (s) | 0 | 0 |
| Elaboration Depth | 8.643 | 3.862 |
| Avg Sentence Length | 9.136 | 10.607 |
| Sentences 10+ Words | 172 | 176 |
| MATTR-50 | 0.71 | 0.668 |
| Complex Words (6+ chars) | 963 | 763 |
| Long Words (8+ chars %) | 8.563 | 6.24 |
| Pronunciation Clarity (ASR) | 0.912 | 0.932 |
| Low-Confidence Words (%) | 26.747 | 20.684 |
| Long Pauses >1s | 373 | 243 |
| Filler Ratio (%) | 0.917 | 1.017 |
| Complex Sentence Rate (%) | 32.112 | 42.963 |
| Mistakes (total) | 2 | 4 |
| Mistakes T1 (Core) | 1 | 3 |
| Mistakes T2 (Intermediate) | 1 | 1 |
| Mistakes T3 (Stretch) | 0 | 0 |
| Weighted Error Rate (per 100w) | 0.118 | 0.254 |
| Tutor Correction Rate (%) | 0.8 | 0.601 |
| Reading Turns | 0 | 0 |
| Reading Alpha Tokens | 0 | 0 |
| Reading Duration (s) | 0 | 0 |
| Reading WPM | 0 | 0 |
| Reading Share (%) | 0 | 0 |
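The Weighted Error Rate formula, (3 × T1 + 2 × T2 + 1 × T3) ÷ speaking alpha tokens × 100, translates directly to code; the function name here is illustrative:

```python
def weighted_error_rate(t1, t2, t3, speaking_tokens):
    """Tier-weighted mistakes normalized to a per-100-word rate.

    Core (T1) errors weigh 3x, intermediate (T2) 2x, and
    stretch (T3) 1x; lessons with no speaking tokens score 0.
    """
    if speaking_tokens <= 0:
        return 0.0
    return (3 * t1 + 2 * t2 + 1 * t3) / speaking_tokens * 100
```

The tier weights mean one core error costs as much as three stretch errors, which is why the Accuracy evidence panels call out T1 counts specifically.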
Each lesson card shows the five subscores as a radar, the lesson summary, and an expandable panel with evidence plus segment-level contribution charts.
Lesson 1
The baseline Progress Index in Lesson 1 is 69.6. The most visible signals are Accuracy, Engagement, and Lexical Range.
Top drivers: Accuracy, Engagement, Lexical Range
Lesson 2
In Lesson 2, progress declined to 67.4 (-2.2 vs previous, -2.2 vs baseline). The biggest drivers were Lexical Range, Complexity, and Accuracy.
Top drivers: Lexical Range, Complexity, Accuracy
Accuracy (+47.3)
At baseline there is no previous lesson to compare against; Accuracy contributes +47.3 with a tier-weighted error rate of 0.1 per 100 words, including 1 core (T1) error.
Mistake: T1 · "it like"
Engagement (+33.0)
At baseline, Engagement contributes +33.0: student speaking share is 0.69 and pronunciation clarity (ASR confidence) is 0.912.
Student: right and it end up with this cold war and it adapt with kind of so socialistic communism idea including poland and nobody asked us whether we want to be there or not this was j...
Lexical Range (+25.4)
At baseline, Lexical Range contributes +25.4: MATTR is 0.710, which reflects how much vocabulary variety showed up in the lesson.
Student: right and it end up with this cold war and it adapt with kind of so socialistic communism idea including poland and nobody asked us whether we want to be there or not this was j...
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 187.1 | 34 | 108.54 | 61.818 | 17 | 0 | 7 | 10 | 1 | 3.5 | 0 | 0.824 | 7 | 0 | 0.911 | 26.471 | 4 | 0 | 20 | 0 | 67.641 |
| S02 | 300 | 396 | 85.847 | 58.929 | 35.727 | 0 | 64 | 3.2 | 5 | 7.784 | 17 | 0.689 | 78 | 7.576 | 0.893 | 33.333 | 39 | 0 | 37.255 | 2.174 | 67.649 |
| S03 | 299.615 | 355 | 77.119 | 59.564 | 59 | 0 | 52 | 4 | 2 | 12.207 | 15 | 0.714 | 76 | 4.225 | 0.879 | 32.958 | 43 | 1.408 | 62.069 | 0.415 | 68.204 |
| S04 | 300 | 283 | 90.853 | 47.167 | 56.2 | 0 | 18 | 0 | 0 | 11.75 | 13 | 0.731 | 60 | 6.007 | 0.921 | 25.442 | 32 | 1.06 | 29.167 | 0 | 71.962 |
| S05 | 299.975 | 438 | 90.809 | 73.122 | 48.667 | 0 | 47 | 0 | 0 | 10.429 | 18 | 0.74 | 107 | 8.904 | 0.922 | 24.658 | 36 | 0 | 38.095 | 1.242 | 70.887 |
| S06 | 299.68 | 508 | 105.648 | 84.526 | 46.091 | 0 | 43 | 32 | 1 | 9.407 | 21 | 0.708 | 116 | 9.055 | 0.912 | 27.165 | 35 | 1.181 | 25.926 | 1.075 | 69.847 |
| S07 | 300 | 496 | 101.399 | 83.784 | 44.545 | 0 | 59 | 0 | 0 | 8.927 | 16 | 0.667 | 117 | 8.871 | 0.911 | 24.597 | 37 | 2.218 | 18.182 | 0 | 62.866 |
| S08 | 299.785 | 434 | 108.731 | 68.454 | 35.75 | 0 | 28 | 3 | 1 | 7.414 | 16 | 0.719 | 107 | 9.677 | 0.92 | 25.346 | 30 | 0.461 | 22.414 | 0.5 | 69.044 |
| S09 | 299.885 | 458 | 93.619 | 81.495 | 50.667 | 0 | 38 | 42 | 1 | 10.364 | 23 | 0.685 | 113 | 9.17 | 0.908 | 29.039 | 37 | 1.747 | 31.818 | 0.962 | 70.106 |
| S10 | 300 | 387 | 84.544 | 59.084 | 29.231 | 0 | 46 | 3.333 | 3 | 6.031 | 14 | 0.753 | 86 | 10.078 | 0.922 | 23.773 | 35 | 0.775 | 28.125 | 1.119 | 67.604 |
| S11 | 299.775 | 462 | 99.861 | 82.5 | 66 | 0 | 28 | 0 | 0 | 14 | 19 | 0.734 | 96 | 10.823 | 0.929 | 22.511 | 45 | 0.216 | 54.545 | 0 | 78.734 |
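Complex Sentence Rate hinges on connector detection. A sketch using the connector list from the metric description; the tokenization regex is an assumption:

```python
import re

# Connector markers named in the metric description.
CONNECTORS = {"because", "although", "while", "if", "when",
              "which", "but", "so", "therefore", "since"}

def complex_sentence_rate(sentences):
    """Percentage of sentences containing at least one connector."""
    if not sentences:
        return 0.0
    hits = sum(
        1 for s in sentences
        if CONNECTORS & set(re.findall(r"[a-z']+", s.lower()))
    )
    return 100 * hits / len(sentences)
```

Matching whole tokens rather than substrings matters here: a substring check would count "butter" as containing "but".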
Lexical Range (-13.2)
Lexical range declined because MATTR moved from 0.710 to 0.668, which reflects how much vocabulary variety showed up in the lesson.
Student: it's not a big deal unless you're getting in a panic and also some people actually some people using the safety jacket which is which the life jackets which is big mistake which...
Complexity (+6.1)
Complexity improved because average sentence length moved from 9.1 to 10.6 words and connector coverage shifted from 32.1% to 43.0%.
Student: it's not a big deal unless you're getting in a panic and also some people actually some people using the safety jacket which is which the life jackets which is big mistake which...
Accuracy (-3.1)
Accuracy declined because tier-weighted error rate moved from 0.1 to 0.3 per 100 words, with core (T1) errors moving from 1 to 3.
Mistake: T1 · "it like"
| Segment | Duration (s) | Speaking Alpha Tokens | Student WPM | Student Talk Ratio (%) | Avg Turn Words | Response Latency (s) | Latency Pairs | Elaboration Depth | Open Prompt Responses | Avg Sentence Length | Sentences 10+ Words | MATTR-50 | Complex Words (6+ chars) | Long Words (8+ chars %) | Pronunciation Clarity (ASR) | Low-Confidence Words (%) | Long Pauses >1s | Filler Ratio (%) | Complex Sentence Rate (%) | Tutor Correction Rate (%) | Progress Index |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S01 | 299.515 | 415 | 86.297 | 65.665 | 45.778 | 0 | 68 | 5.5 | 6 | 9.605 | 16 | 0.701 | 80 | 5.06 | 0.934 | 21.446 | 31 | 1.446 | 44.186 | 0.922 | 69.589 |
| S02 | 300.015 | 427 | 88.109 | 64.211 | 52.875 | 0 | 73 | 0 | 0 | 11.158 | 19 | 0.674 | 84 | 4.45 | 0.938 | 20.375 | 24 | 1.171 | 34.211 | 0 | 68.321 |
| S03 | 300.115 | 412 | 93.875 | 56.284 | 45.111 | 0 | 84 | 1.5 | 2 | 10 | 14 | 0.654 | 71 | 6.068 | 0.933 | 20.874 | 33 | 1.456 | 34.146 | 0.312 | 66.259 |
| S04 | 299.95 | 514 | 104.33 | 64.492 | 56.556 | 0 | 79 | 3 | 4 | 12.415 | 22 | 0.663 | 88 | 6.615 | 0.944 | 16.732 | 21 | 0.584 | 53.659 | 0.353 | 72.338 |
| S05 | 299.93 | 517 | 104.299 | 74.603 | 56.667 | 0 | 73 | 1 | 1 | 11.422 | 21 | 0.671 | 76 | 5.609 | 0.928 | 21.857 | 24 | 1.547 | 42.222 | 0 | 65.634 |
| S06 | 299.775 | 479 | 97.993 | 64.73 | 52.444 | 0 | 96 | 4.875 | 8 | 10.556 | 19 | 0.701 | 82 | 6.263 | 0.936 | 19.207 | 24 | 1.253 | 53.333 | 0.766 | 69.16 |
| S07 | 299 | 404 | 91.133 | 57.714 | 49.75 | 0 | 48 | 1 | 2 | 10.553 | 17 | 0.641 | 77 | 8.663 | 0.93 | 21.535 | 22 | 0.248 | 36.842 | 0 | 68.784 |
| S08 | 299.365 | 548 | 112.268 | 69.455 | 60.111 | 0 | 121 | 0 | 0 | 12.022 | 19 | 0.631 | 102 | 6.387 | 0.929 | 19.891 | 22 | 0.547 | 42.222 | 0.415 | 71.901 |
| S09 | 299.97 | 423 | 84.608 | 58.025 | 52.25 | 0 | 77 | 4.75 | 4 | 10.744 | 20 | 0.654 | 80 | 7.329 | 0.917 | 24.586 | 29 | 1.182 | 56.41 | 1.634 | 65.214 |
| S10 | 60.115 | 101 | 100.807 | 65.584 | 50.5 | 0 | 13 | 0 | 0 | 16.833 | 6 | 0.671 | 8 | 4.95 | 0.941 | 19.802 | 4 | 0.99 | 66.667 | 0 | 76.263 |
| S11 | 74.745 | 59 | 47.616 | 39.865 | 19 | 0 | 14 | 1 | 1 | 4.692 | 2 | 0.718 | 12 | 8.475 | 0.892 | 28.814 | 6 | 0 | 23.077 | 0 | 45.766 |
| S12 | 61.805 | 23 | 46.685 | 12.234 | 11 | 0 | 12 | 0 | 0 | 3.667 | 1 | 0.739 | 3 | 4.348 | 0.914 | 21.739 | 1 | 0 | 16.667 | 1.818 | 52.758 |
| S13 | 61.355 | 5 | 7.634 | 2.66 | 3 | 0 | 5 | 2 | 1 | 1.2 | 0 | 0.6 | 0 | 0 | 0.988 | 0 | 2 | 0 | 0 | 1.093 | 24.819 |
Scoring reference

- Fluency = 0.40*wpm + 0.30*inverse_disfluency + 0.30*inverse_long_pause_rate
- Accuracy = log-mapped tier-weighted mistake rate
- Complexity = 0.40*avg_sentence_len + 0.35*complex_sentence_rate + 0.25*long_words_pct
- Lexical Range = MATTR-50
- Engagement = 0.50*asr_confidence + 0.25*speaking_share + 0.15*elaboration_depth + 0.10*inverse_response_latency
- student_response_latency_s = median gap between a tutor turn ending and the next student turn beginning
- elaboration_depth = average student response length after open tutor prompts like why, how, tell me, or describe
- tutor_correction_rate = supplementary context only; it does not affect the Progress Index
- student_reading_wpm = reading alpha tokens divided by reading-turn minutes
- student_reading_share_pct = share of the student's total alpha tokens that came from reading turns
- Progress Index = subscore_fluency × 0.25 + subscore_accuracy × 0.25 + subscore_complexity × 0.20 + subscore_lexical_range × 0.15 + subscore_engagement × 0.15
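The subscore weights above roll up into the Progress Index as a simple weighted sum; this sketch assumes the subscores already sit on the 0-100 anchor scale:

```python
# Fixed roll-up weights from the scoring reference.
SUBSCORE_WEIGHTS = {
    "fluency": 0.25,
    "accuracy": 0.25,
    "complexity": 0.20,
    "lexical_range": 0.15,
    "engagement": 0.15,
}

def progress_index(subscores):
    """Combine five 0-100 subscores into the Progress Index."""
    missing = set(SUBSCORE_WEIGHTS) - set(subscores)
    if missing:
        raise ValueError(f"missing subscores: {sorted(missing)}")
    return sum(SUBSCORE_WEIGHTS[k] * subscores[k] for k in SUBSCORE_WEIGHTS)
```

Since the weights sum to 1.0, the Progress Index stays on the same 0-100 scale as its inputs.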