Language Detection

Scitor automatically detects the language of every inbound email and form submission, then labels the GitHub issue or discussion with lang:<code>, for example lang:de, lang:fr, or lang:ja. At a glance your team can see which tickets need a native-language response.

Language detection is available on the Pro plan.

How it works

When a new ticket arrives, Scitor’s AI engine identifies the language from the subject and body of the message (using the first ~500 characters). The result is a two-letter ISO 639-1 code. The corresponding lang:<code> label is automatically created in your repository (if it doesn’t already exist) and applied to the issue or discussion.

ISO code   Language
en         English
de         German
fr         French
es         Spanish
it         Italian
nl         Dutch
pt         Portuguese
ja         Japanese
zh         Chinese
ko         Korean
ru         Russian

All of these are pre-seeded with a soft-blue label color (#c0e0ff). Any other valid ISO 639-1 code (e.g. ar, pl, sv) is created on demand with the same color when first encountered.
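
The on-demand label creation described above can be sketched as a small helper that builds the request body for GitHub's "create a label" REST endpoint (POST /repos/{owner}/{repo}/labels), which expects the color as a six-digit hex string without the leading #. The function name build_lang_label is hypothetical and this is only an illustration under those assumptions, not Scitor's actual implementation.

```python
import re

# Soft-blue color from this article, without the leading "#"
# (GitHub's label API expects a bare six-digit hex string).
LANG_LABEL_COLOR = "c0e0ff"


def build_lang_label(code: str) -> dict:
    """Return a label payload for a two-letter code (hypothetical helper)."""
    # Reject anything that is not exactly two lowercase ASCII letters.
    if not re.fullmatch(r"[a-z]{2}", code):
        raise ValueError(f"not a two-letter language code: {code!r}")
    return {"name": f"lang:{code}", "color": LANG_LABEL_COLOR}


print(build_lang_label("de"))  # {'name': 'lang:de', 'color': 'c0e0ff'}
```

The same payload shape applies to codes outside the pre-seeded list, such as sv or pl, since they share the same color.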

Label examples

Label     Meaning
lang:en   Message is in English
lang:de   Message is in German
lang:fr   Message is in French
lang:nl   Message is in Dutch

Configuration

Language detection is enabled by default on Pro installations. You can disable it in .github/scitor.yaml:

ai:
  language_detection: false   # disable lang:* labels

To re-enable explicitly:

ai:
  language_detection: true    # default on Pro

Note: Setting ai: false (the legacy scalar form) disables all AI features including language detection. The object form ai: { language_detection: false } disables only language detection while keeping sentiment, category, and summary analysis active.
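
To illustrate the difference between the scalar and object forms, here is a minimal sketch of how the already-parsed value of the ai key could be interpreted. The function name is hypothetical, and treating a missing key or ai: true as "enabled" is an assumption based on the Pro default described above.

```python
from typing import Any


def language_detection_enabled(ai: Any = None) -> bool:
    """Interpret the parsed value of the top-level `ai` key (hypothetical helper)."""
    if ai is False:
        # Legacy scalar form: `ai: false` turns off every AI feature.
        return False
    if isinstance(ai, dict):
        # Object form: only an explicit `language_detection: false` disables it.
        return ai.get("language_detection", True) is not False
    # Assumption: key absent or `ai: true` falls back to the Pro default (enabled).
    return True


print(language_detection_enabled(False))                          # False
print(language_detection_enabled({"language_detection": False}))  # False
print(language_detection_enabled({}))                             # True
```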

Detection accuracy

  • Detection is based on the first ~500 characters of the message body, which is sufficient for most languages.
  • Very short messages (a few words) or messages mixing multiple languages may be detected incorrectly. If the model returns anything other than a two-letter lowercase code, no label is applied: the output is validated against a strict ^[a-z]{2}$ regex and silently discarded if it doesn’t match.
  • Detection runs as part of the standard AI analysis call (aiAnalyze), so it adds no extra latency: the language field is returned in the same JSON response as sentiment, category, and priority.
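
The validation step in the list above can be sketched as follows. This assumes the analysis response is a JSON object and that the field is keyed as "language"; both the key name and the function name are illustrative assumptions, not confirmed details of aiAnalyze.

```python
import json
import re
from typing import Optional

# Same pattern as ^[a-z]{2}$; used with fullmatch below.
LANG_CODE = re.compile(r"[a-z]{2}")


def extract_language(raw_response: str) -> Optional[str]:
    """Return the validated language code, or None if it should be discarded."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    value = data.get("language")  # assumed field name
    # fullmatch requires the whole string to match, so values with
    # extra characters (including a trailing newline) are rejected.
    if isinstance(value, str) and LANG_CODE.fullmatch(value):
        return value
    return None


print(extract_language('{"language": "de"}'))      # de
print(extract_language('{"language": "German"}'))  # None
```

Returning None instead of raising mirrors the "silently discarded" behaviour: the caller simply skips the label when no valid code comes back.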

Plan gating

Language detection requires the Pro plan (plan_id >= 2). On Free plans, the lang:* label is never applied regardless of the ai.language_detection setting.
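
As a rough illustration of the gating rule, the plan check takes precedence over the configuration value. The helper name is hypothetical, and plan_id 1 is used below only as an example of a plan below Pro.

```python
PRO_PLAN_ID = 2  # threshold from this article: plan_id >= 2


def should_apply_lang_label(plan_id: int, language_detection: bool = True) -> bool:
    """Combine plan gating with the ai.language_detection setting (hypothetical helper)."""
    # Below the Pro threshold the setting is ignored entirely.
    return plan_id >= PRO_PLAN_ID and language_detection


print(should_apply_lang_label(1, True))   # False
print(should_apply_lang_label(2, True))   # True
print(should_apply_lang_label(2, False))  # False
```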


Scitor: Turn GitHub into your support platform