force all files of the same text to have the same segment IDs #2941

sujato · 2023-11-01T05:13:40Z

Currently we allow bilara files of the same text to omit segment IDs. This is handy, especially for things like references and variants where there are only a few items. However, it has created issues for us in development. For example, it makes it complicated to test for segment correctness and sort order.
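To illustrate the sort-order part of this, here is a minimal sketch of a sort key for segment IDs. It assumes that the part after the colon is always a dot-separated run of integers, as in the sn6.7 example; real data may contain other forms, so treat `segment_sort_key` and `is_sorted` as hypothetical helpers, not as existing Bilara code.

```python
def segment_sort_key(segment_id: str):
    """Split an ID such as 'sn6.7:2.10' into ('sn6.7', (2, 10))."""
    uid, _, number = segment_id.partition(":")
    # Assumption: everything after the colon is dot-separated integers.
    return (uid, tuple(int(part) for part in number.split(".")))


def is_sorted(segment_ids):
    """Return True if the IDs are already in numeric segment order."""
    keys = [segment_sort_key(sid) for sid in segment_ids]
    return keys == sorted(keys)
```

A plain string sort would place "sn6.7:2.10" before "sn6.7:2.9", which is why the numeric parts are compared as integers here.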

After discussion with STXnext, we propose to ensure that every instance of the same text has the same segment IDs, with no omissions permitted.

This applies to everything in Bilara, i.e. to texts as well as to site, blurbs, name, etc.
Even if there is only one item in a file of a thousand segment IDs, we have to list them all!
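A check for this rule could look roughly like the sketch below. It assumes that the root file is taken as the reference set of IDs and that each file has already been loaded into a dict that keeps the file's key order; `compare_segment_ids` and `has_identical_ids` are hypothetical names used only for illustration.

```python
def compare_segment_ids(root: dict, other: dict):
    """Return (missing, extra) segment IDs of `other` relative to `root`."""
    missing = [sid for sid in root if sid not in other]
    extra = [sid for sid in other if sid not in root]
    return missing, extra


def has_identical_ids(root: dict, other: dict) -> bool:
    """True only if both files list the same IDs in the same order."""
    return list(root) == list(other)


# A cut-down version of the sn6.7 files, for illustration only.
root = {"sn6.7:0.1": "Saṁyutta Nikāya 6.7 ", "sn6.7:0.2": "1. Paṭhamavagga ",
        "sn6.7:0.3": "Kokālikasutta "}
variant = {"sn6.7:0.3": "Kokālikasutta → kokālikasuttaṁ"}

missing, extra = compare_segment_ids(root, variant)
```

With the sparse variant file above, `missing` holds the two omitted IDs and `extra` is empty; under the proposed rule both lists would have to be empty.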

We currently allow full segment identity, but do not enforce it. So this will not introduce new situations, merely reduce some flexibility. Therefore I expect that generally this should not cause problems in our systems.

Nonetheless, the world is a weird and wonderful place so we should make sure we test it out well!

This will affect all the apps downstream of bilara-data:

SC
Voice
Publications
Third-party apps

We will develop this initially in the new Bilara 2.0. Once that is ready we can test in other scenarios.

example

Let us take sn6.7 as an example.

current

sn6.7_html.json:

{
  "sn6.7:0.1": "<article id='sn6.7'><header><ul><li class='division'>{}</li>",
  "sn6.7:0.2": "<li>{}</li></ul>",
  "sn6.7:0.3": "<h1 class='sutta-title'>{}</h1></header>",
  "sn6.7:1.1": "<p>{}</p>",
  "sn6.7:1.2": "<p>{}",
  "sn6.7:1.3": "{}</p>",
  "sn6.7:1.4": "<p>{}</p>",
  "sn6.7:2.1": "<blockquote class='gatha'><p><span class='verse-line'>{}</span>",
  "sn6.7:2.2": "<span class='verse-line'>{}</span>",
  "sn6.7:2.3": "<span class='verse-line'>{}</span>",
  "sn6.7:2.4": "<span class='verse-line'>{}</span></p></blockquote></article>"
}

sn6.7_root-pli-ms.json:

{
  "sn6.7:0.1": "Saṁyutta Nikāya 6.7 ",
  "sn6.7:0.2": "1. Paṭhamavagga ",
  "sn6.7:0.3": "Kokālikasutta ",
  "sn6.7:1.1": "Sāvatthinidānaṁ. ",
  "sn6.7:1.2": "Tena kho pana samayena bhagavā divāvihāragato hoti paṭisallīno. ",
  "sn6.7:1.3": "Atha kho subrahmā ca paccekabrahmā suddhāvāso ca paccekabrahmā yena bhagavā tenupasaṅkamiṁsu; upasaṅkamitvā paccekaṁ dvārabāhaṁ nissāya aṭṭhaṁsu. ",
  "sn6.7:1.4": "Atha kho subrahmā paccekabrahmā kokālikaṁ bhikkhuṁ ārabbha bhagavato santike imaṁ gāthaṁ abhāsi: ",
  "sn6.7:2.1": "“Appameyyaṁ paminanto, ",
  "sn6.7:2.2": "Kodha vidvā vikappaye; ",
  "sn6.7:2.3": "Appameyyaṁ pamāyinaṁ, ",
  "sn6.7:2.4": "Nivutaṁ taṁ maññe puthujjanan”ti. "
}

sn6.7_variant-pli-ms.json:

{
  "sn6.7:0.3": "Kokālikasutta → kokālikasuttaṁ (1) (cck, pts2ed); paṭhamakokālikasuttaṁ (sya1ed, sya2ed) "
}

sn6.7_reference.json:

{
  "sn6.7:1.1": "ms12S1_1070, msdiv178, ndp12.149, sya15.218",
  "sn6.7:2.1": "cck15.200, ms12S1_1071, pts-vp-pli2ed1.323"
}

proposed

In this case, root and html are unchanged, but variant and reference have segments with empty values assigned.

sn6.7_html.json:

{
  "sn6.7:0.1": "<article id='sn6.7'><header><ul><li class='division'>{}</li>",
  "sn6.7:0.2": "<li>{}</li></ul>",
  "sn6.7:0.3": "<h1 class='sutta-title'>{}</h1></header>",
  "sn6.7:1.1": "<p>{}</p>",
  "sn6.7:1.2": "<p>{}",
  "sn6.7:1.3": "{}</p>",
  "sn6.7:1.4": "<p>{}</p>",
  "sn6.7:2.1": "<blockquote class='gatha'><p><span class='verse-line'>{}</span>",
  "sn6.7:2.2": "<span class='verse-line'>{}</span>",
  "sn6.7:2.3": "<span class='verse-line'>{}</span>",
  "sn6.7:2.4": "<span class='verse-line'>{}</span></p></blockquote></article>"
}

sn6.7_root-pli-ms.json:

{
  "sn6.7:0.1": "Saṁyutta Nikāya 6.7 ",
  "sn6.7:0.2": "1. Paṭhamavagga ",
  "sn6.7:0.3": "Kokālikasutta ",
  "sn6.7:1.1": "Sāvatthinidānaṁ. ",
  "sn6.7:1.2": "Tena kho pana samayena bhagavā divāvihāragato hoti paṭisallīno. ",
  "sn6.7:1.3": "Atha kho subrahmā ca paccekabrahmā suddhāvāso ca paccekabrahmā yena bhagavā tenupasaṅkamiṁsu; upasaṅkamitvā paccekaṁ dvārabāhaṁ nissāya aṭṭhaṁsu. ",
  "sn6.7:1.4": "Atha kho subrahmā paccekabrahmā kokālikaṁ bhikkhuṁ ārabbha bhagavato santike imaṁ gāthaṁ abhāsi: ",
  "sn6.7:2.1": "“Appameyyaṁ paminanto, ",
  "sn6.7:2.2": "Kodha vidvā vikappaye; ",
  "sn6.7:2.3": "Appameyyaṁ pamāyinaṁ, ",
  "sn6.7:2.4": "Nivutaṁ taṁ maññe puthujjanan”ti. "
}

sn6.7_variant-pli-ms.json:

{
  "sn6.7:0.1": "",
  "sn6.7:0.2": "",
  "sn6.7:0.3": "Kokālikasutta → kokālikasuttaṁ (1) (cck, pts2ed); paṭhamakokālikasuttaṁ (sya1ed, sya2ed) ",
  "sn6.7:1.1": "",
  "sn6.7:1.2": "",
  "sn6.7:1.3": "",
  "sn6.7:1.4": "",
  "sn6.7:2.1": "",
  "sn6.7:2.2": "",
  "sn6.7:2.3": "",
  "sn6.7:2.4": ""
}

sn6.7_reference.json:

{
  "sn6.7:0.1": "",
  "sn6.7:0.2": "",
  "sn6.7:0.3": "",
  "sn6.7:1.1": "ms12S1_1070, msdiv178, ndp12.149, sya15.218",
  "sn6.7:1.2": "",
  "sn6.7:1.3": "",
  "sn6.7:1.4": "",
  "sn6.7:2.1": "cck15.200, ms12S1_1071, pts-vp-pli2ed1.323",
  "sn6.7:2.2": "",
  "sn6.7:2.3": "",
  "sn6.7:2.4": ""
}
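The conversion from the current sparse files to the proposed form could be sketched as below. This is only an illustration under the assumption that the root file defines the full ID list; `pad_to_root` is a hypothetical helper, not part of Bilara.

```python
def pad_to_root(root: dict, sparse: dict) -> dict:
    """Return a copy of `sparse` with every root ID present, in root order.

    IDs missing from `sparse` get an empty string as their value.
    """
    unknown = [sid for sid in sparse if sid not in root]
    if unknown:
        # An ID that the root does not have cannot be padded away.
        raise ValueError(f"segment IDs not found in root: {unknown}")
    return {sid: sparse.get(sid, "") for sid in root}


# Segment IDs of sn6.7 as listed in the root file (values omitted here).
root_ids = ["0.1", "0.2", "0.3", "1.1", "1.2", "1.3", "1.4",
            "2.1", "2.2", "2.3", "2.4"]
root = {f"sn6.7:{n}": "" for n in root_ids}

reference = {
    "sn6.7:1.1": "ms12S1_1070, msdiv178, ndp12.149, sya15.218",
    "sn6.7:2.1": "cck15.200, ms12S1_1071, pts-vp-pli2ed1.323",
}

padded = pad_to_root(root, reference)
```

Raising on unknown IDs rather than dropping them is a deliberate choice in this sketch, so that a file with more segments than the root is reported instead of silently losing data.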

thesunshade · 2023-11-01T05:26:58Z

As someone who likes to make things from your data, this sounds like a great thing.

firepick1 · 2023-11-01T13:16:04Z

This would allow us to delete the merging code in Voice, which exists because not all translations have the same segments. Thanks for the notification.
... I have a vague memory that Ven. Brahmali may have added extra segments (i.e., more segments than root). If so, then this might adversely impact his translations.

ihongda · 2023-11-03T03:31:59Z

Thanks, Bhante.
I'm going to check the relevant code to see if any changes need to be made.
