Language Detection

Scitor automatically detects the language of every inbound email and form submission, then labels the GitHub issue or discussion with lang:<code> — for example lang:de, lang:fr, or lang:ja. At a glance your team can see which tickets need a native-language response.

Language detection is available on the Pro plan.

How it works

When a new ticket arrives, Scitor’s AI engine identifies the language from the subject and body of the message (using the first ~500 characters). The result is a two-letter ISO 639-1 code. The corresponding lang:<code> label is automatically created in your repository (if it doesn’t already exist) and applied to the issue or discussion.

ISO code	Language
`en`	English
`de`	German
`fr`	French
`es`	Spanish
`it`	Italian
`nl`	Dutch
`pt`	Portuguese
`ja`	Japanese
`zh`	Chinese
`ko`	Korean
`ru`	Russian

The languages listed above are pre-seeded with a soft-blue label color (#c0e0ff). Any other valid ISO 639-1 code (e.g. ar, pl, sv) is created on demand with the same color when first encountered.
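The create-if-missing behavior can be sketched as follows. This is a minimal illustration, not Scitor's actual code: the repository's labels are modeled as an in-memory dict, and the names `ensure_lang_label` and `LANG_LABEL_COLOR` are hypothetical.

```python
from typing import Dict

# Soft blue from the article, written without the leading '#'.
LANG_LABEL_COLOR = "c0e0ff"


def ensure_lang_label(repo_labels: Dict[str, str], code: str) -> str:
    """Return the lang:<code> label name, adding it to repo_labels if missing.

    repo_labels maps label name -> hex color and stands in for the
    repository's real label list (an assumption for this sketch).
    """
    name = f"lang:{code}"
    # setdefault leaves an existing label and its color untouched.
    repo_labels.setdefault(name, LANG_LABEL_COLOR)
    return name


labels = {"lang:en": "c0e0ff", "bug": "d73a4a"}
print(ensure_lang_label(labels, "sv"))  # lang:sv
print(labels["lang:sv"])                # c0e0ff
```

A label that already exists keeps whatever color it has; only a missing label is created with the default color.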

Label examples

Label	Meaning
`lang:en`	Message is in English
`lang:de`	Message is in German
`lang:fr`	Message is in French
`lang:nl`	Message is in Dutch

Configuration

Language detection is enabled by default on Pro installations. You can disable it in .github/scitor.yaml:

ai:
  language_detection: false   # disable lang:* labels

To re-enable explicitly:

ai:
  language_detection: true    # default on Pro

Note: Setting ai: false (the legacy scalar form) disables all AI features including language detection. The object form ai: { language_detection: false } disables only language detection while keeping sentiment, category, and summary analysis active.
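To make the difference between the two forms concrete, here is a hedged sketch of how a parsed scitor.yaml could be interpreted. The function name `language_detection_enabled` is hypothetical, and treating a missing or unrecognized `ai` value as "use the defaults" is an assumption rather than documented behavior.

```python
from collections.abc import Mapping
from typing import Any


def language_detection_enabled(config: Mapping[str, Any]) -> bool:
    """Decide whether lang:* labelling is switched on in a parsed config.

    config is the already-parsed YAML document as a dict.
    """
    ai = config.get("ai", True)  # assumption: missing key means defaults
    if isinstance(ai, bool):
        # Legacy scalar form: `ai: false` turns off every AI feature.
        return ai
    if isinstance(ai, Mapping):
        # Object form: only the named feature is toggled; default is on.
        return bool(ai.get("language_detection", True))
    return True  # assumption: unrecognized values fall back to the default


print(language_detection_enabled({"ai": False}))                           # False
print(language_detection_enabled({"ai": {"language_detection": False}}))   # False
print(language_detection_enabled({"ai": {"language_detection": True}}))    # True
print(language_detection_enabled({}))                                      # True
```

Plan gating is a separate check and is not modeled here.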

Detection accuracy

Detection is based on the first ~500 characters of the message body, which is sufficient for most languages.
Very short messages (a few words) or messages that mix several languages may be detected incorrectly. As a safeguard, the model output is validated with a strict ^[a-z]{2}$ regex and silently discarded if it doesn’t match, so a malformed result produces no label rather than a wrong one. This check validates only the format of the code.
Detection runs as part of the standard AI analysis call (aiAnalyze) at no extra latency cost — the language field is included in the same JSON response as sentiment, category, and priority.
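The validation step described above might look like the sketch below. It assumes the field is named `language` in the JSON response, and the helper name `extract_language` is hypothetical. It uses `re.fullmatch` instead of `^...$` because in Python `$` also matches just before a trailing newline, which would let `"de\n"` through.

```python
import json
import re
from typing import Optional

# Two lowercase ASCII letters, matched against the whole string.
LANG_CODE = re.compile(r"[a-z]{2}")


def extract_language(raw_response: str) -> Optional[str]:
    """Return a validated two-letter code, or None if the output is unusable."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    value = data.get("language")  # assumed field name
    if isinstance(value, str) and LANG_CODE.fullmatch(value):
        return value
    return None  # silently discard anything malformed


print(extract_language('{"language": "de", "sentiment": "neutral"}'))  # de
print(extract_language('{"language": "German"}'))                      # None
print(extract_language('{"language": "DE"}'))                          # None
```

Returning `None` lets the caller skip labelling entirely, which matches the "no label rather than a wrong one" behavior for malformed output.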

Plan gating

Language detection requires the Pro plan (plan_id >= 2). On Free plans, the lang:* label is never applied regardless of the ai.language_detection setting.
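Put together with the configuration flag, the gating rule amounts to a two-part condition. The sketch below is illustrative only: `PRO_PLAN_ID` and `lang_label_allowed` are hypothetical names, and the threshold of 2 is taken from the plan_id check stated above.

```python
PRO_PLAN_ID = 2  # from the article: Pro means plan_id >= 2


def lang_label_allowed(plan_id: int, detection_enabled: bool) -> bool:
    """True only when the plan qualifies AND the setting is switched on."""
    return plan_id >= PRO_PLAN_ID and detection_enabled


print(lang_label_allowed(1, True))   # False: the plan check fails, the setting is ignored
print(lang_label_allowed(2, True))   # True
print(lang_label_allowed(2, False))  # False: disabled in configuration
```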
