Language Detection
Scitor automatically detects the language of every inbound email and form submission, then labels the GitHub issue or discussion with lang:<code>, for example lang:de, lang:fr, or lang:ja. At a glance your team can see which tickets need a native-language response.
Language detection is available on the Pro plan.
How it works
When a new ticket arrives, Scitor's AI engine identifies the language from the subject and body of the message (using the first ~500 characters). The result is a two-letter ISO 639-1 code. The corresponding lang:<code> label is automatically created in your repository (if it doesn't already exist) and applied to the issue or discussion.
| ISO code | Language |
|---|---|
| en | English |
| de | German |
| fr | French |
| es | Spanish |
| it | Italian |
| nl | Dutch |
| pt | Portuguese |
| ja | Japanese |
| zh | Chinese |
| ko | Korean |
| ru | Russian |
All of these are pre-seeded with a soft-blue label color (#c0e0ff). Any other valid ISO 639-1 code (e.g. ar, pl, sv) is created on demand with the same color when first encountered.
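The naming and color rules above can be sketched in a few lines of Python. This is an illustration only, not Scitor's actual code: the function name `lang_label`, the `PRESEEDED` set, and the returned dictionary shape are assumptions made for the example. The color is written without the leading `#`, which is how label colors are usually passed to the GitHub API.

```python
# Hypothetical sketch of the lang:<code> label rules described above.
# None of these names come from Scitor itself.

LABEL_COLOR = "c0e0ff"  # soft blue from the text, without the leading '#'

# The eleven codes listed in the table above.
PRESEEDED = {"en", "de", "fr", "es", "it", "nl", "pt", "ja", "zh", "ko", "ru"}


def lang_label(code: str) -> dict:
    """Build a label description for a two-letter ISO 639-1 code."""
    return {
        "name": f"lang:{code}",
        "color": LABEL_COLOR,            # same color for pre-seeded and on-demand labels
        "preseeded": code in PRESEEDED,  # False means "created when first encountered"
    }
```

For instance, `lang_label("sv")` describes a label named `lang:sv` with the same color as the pre-seeded ones, flagged as created on demand.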
Label examples
| Label | Meaning |
|---|---|
| lang:en | Message is in English |
| lang:de | Message is in German |
| lang:fr | Message is in French |
| lang:nl | Message is in Dutch |
Configuration
Language detection is enabled by default on Pro installations. You can disable it in .github/scitor.yaml:
ai:
  language_detection: false # disable lang:* labels
To re-enable explicitly:
ai:
  language_detection: true # default on Pro
Note: Setting ai: false (the legacy scalar form) disables all AI features including language detection. The object form ai: { language_detection: false } disables only language detection while keeping sentiment, category, and summary analysis active.
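As a rough sketch of how the scalar and object forms might be interpreted, the Python function below takes the already-parsed value of the `ai` key and decides whether language detection is on. The function name and the handling of a missing key or of `ai: true` are assumptions for illustration; the article only specifies the behavior of `ai: false` and of the `language_detection` flag.

```python
from typing import Any


def language_detection_enabled(ai_setting: Any = None) -> bool:
    """Interpret the parsed 'ai' value from .github/scitor.yaml (hypothetical helper)."""
    if ai_setting is False:
        # Legacy scalar form: every AI feature is off, language detection included.
        return False
    if isinstance(ai_setting, dict):
        # Object form: only an explicit false turns the feature off.
        return ai_setting.get("language_detection", True) is not False
    # Missing key (or any other value): assumed to fall back to the default, which is on.
    return True
```

Plan gating is a separate check and is not covered by this helper.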
Detection accuracy
- Detection is based on the first ~500 characters of the message body, which is sufficient for most languages.
- Very short messages (a few words) or messages mixing multiple languages may be detected incorrectly. In those cases no label is applied rather than a wrong one: the model output is validated with a strict ^[a-z]{2}$ regex and silently discarded if it doesn't match.
- Detection runs as part of the standard AI analysis call (aiAnalyze) at no extra latency cost: the language field is included in the same JSON response as sentiment, category, and priority.
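The validation step can be illustrated with a short Python sketch. The regex is the one quoted above; the function name `label_from_model_output` and the choice to return `None` for discarded values are assumptions for the example, not Scitor's implementation.

```python
import re
from typing import Any, Optional

# The strict pattern quoted in the article: exactly two lowercase ASCII letters.
LANG_CODE = re.compile(r"^[a-z]{2}$")


def label_from_model_output(raw: Any) -> Optional[str]:
    """Return 'lang:<code>' for a valid code, or None to apply no label."""
    # fullmatch is used so that a trailing newline is also rejected;
    # with plain match(), '$' would accept "de\n" in Python.
    if not isinstance(raw, str) or LANG_CODE.fullmatch(raw) is None:
        return None  # silently discarded, so no label is applied
    return f"lang:{raw}"
```

Values such as `"DE"`, `"deu"`, or `"English"` fail the check, so the ticket simply receives no lang:* label.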
Plan gating
Language detection requires the Pro plan (plan_id >= 2). On Free plans, the lang:* label is never applied regardless of the ai.language_detection setting.
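Put together, the gating rule reads as a simple conjunction. The sketch below is a minimal Python illustration; the function name, the `PRO_PLAN_ID` constant, and the boolean `language_detection` parameter are hypothetical, and only the `plan_id >= 2` threshold comes from the text.

```python
PRO_PLAN_ID = 2  # threshold stated in the article: plan_id >= 2 means Pro


def should_apply_lang_label(plan_id: int, language_detection: bool = True) -> bool:
    """Apply lang:* labels only on Pro and only when the setting is not disabled."""
    return plan_id >= PRO_PLAN_ID and language_detection
```

With a plan ID below 2 the function returns `False` whatever the setting says, which mirrors the rule that Free plans never receive the label.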