Download for Training

Export approved data as .jsonl for LLM training. Only approved submissions are included.

Approved Ilokano texts for pre-training

124 records

Ilokano-English sentence pairs for translation

880 records

Instruction-style grammar explanations

0 records

Ilokano-English dictionary entries

59 records

QA pairs, prompt completions, and dialogs

0 records
Output Format Preview

Each line in the .jsonl file is a JSON object. The _source field indicates which module the record came from.

// Submitted Texts (pre-training)
{"text": "Naimbag a bigat...", "metadata": {"title": "...", "category": "Story"}, "_source": "submissions"}

// Parallel Sentences (translation)
{"ilokano": "Kumusta ka?", "english": "How are you?", "metadata": {"source": "..."}, "_source": "parallel"}

// Grammar Rules (instruction-tuning)
{"instruction": "Explain the Ilokano grammar rule: ...", "output": "...", "_source": "grammar"}

// Vocabulary (dictionary)
{"ilokano": "balay", "english": "house", "part_of_speech": "noun", "_source": "vocabulary"}

// Logic Entries (QA / dialog)
{"type": "qa", "question": "...", "answer": "...", "_source": "logic"}
Full Export