
64 Commits

Author SHA1 Message Date
KovachevBot
c5e259c3c4
Temporarily moved note 2021-06-21 09:51:54 +01:00
KovachevBot
9541bf72ef
Note to readers 2021-06-20 17:01:40 +01:00
KovachevBot
55810d6aa4
Uploaded test outputs
These are the files that represent the only changes the bot would have actually committed if let loose 'in the wild,' as the vast majority of Bulgarian entries near the top of the category already have definitions of some kind. These are the ones without any existing Bulgarian definition. Hopefully there are no problems to see here.
2021-06-20 16:57:32 +01:00
KovachevBot
a2391fbee1
Create empty.txt 2021-06-20 16:55:57 +01:00
KovachevBot
cef647f65b
Your services are no longer required 2021-06-20 16:55:39 +01:00
KovachevBot
3a90eaa1ed
Added output test file for scrutiny
These are the text files containing the proposed wikitext that the bot would have saved on the corresponding page for each file. Each file holds what would have been written to its page, but ignoring the Bulgarian-entry check (i.e. attempting to place new derived-form entries regardless of whether one already existed).
2021-06-20 16:54:16 +01:00
KovachevBot
3c2ab8e328
Just creating a directory 2021-06-20 16:49:15 +01:00
KovachevBot
666aee1bc7
Removed comment on Bulgarian presence check 2021-06-20 16:35:29 +01:00
KovachevBot
074d2843ed
Added info on installation and output 2021-06-20 16:33:52 +01:00
KovachevBot
73d0a5d391
Added 'mapping_test.json' for readers' reference
This file should serve as an example of what the generated dictionary 'pages_to_create' will look like. The example is naturally in JSON, but it conveys the same information as the Python script: a dictionary with the structure seen herein is generated and then used to iterate over a series of pages, providing exactly the data required to generate the contents of each page. (The example seen here is the result of harvesting the declined-form data from https://en.wiktionary.org/wiki/%D0%BA%D1%83%D0%BA%D0%BB%D0%B0.)

Each page title within the dictionary contains an array under the key "associations", which itself contains a number of objects. Each object holds two arrays, "mapping" and "forms".

The array "mapping" contains (lemma, derived form) tuples. In other words, it identifies, for each title, one derived form with reference to the original lemma. (There are multiple lemma and derived-form fields because a derived term can have multiple etymologies: the stress, for example, could differ between etymologies. In this example, they incidentally do not.)

The array "forms" contains (form, number) tuples, specifying what specific type of derived form this is. The exception is the 'count form', which doesn't exist in the singular and is hence given a special designation when used in the derived form template: whereas e.g. definite singular maps simply to {{...def|s}}, the count form doesn't require a plural/singular distinction, since it's always plural; its designation is {{...count|form}}.

The dictionary generated in the article creation process is used as described above to add content to pages. Please do inspect how the script works yourself if you are interested or would like to point out any errors. Thanks for reading!
2021-06-20 16:18:33 +01:00
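For readers who would rather see the shape than read about it, the 'pages_to_create' structure described in the commit message above might look roughly like the sketch below. This is an illustration only, not the bot's actual code: the sample entry, the helper names `template_args` and `flatten`, and the check `form == "count"` are all assumptions made for the example, and the template name itself is left out because the commit message elides it.

```python
# Hypothetical sample following the structure described in the commit message:
# page title -> "associations" -> objects holding "mapping" and "forms".
# The values are illustrative and were not copied from mapping_test.json.
pages_to_create = {
    "куклата": {
        "associations": [
            {
                # (lemma, derived form) tuples, both with stress marks
                "mapping": [("ку́кла", "ку́клата")],
                # (form, number) tuples describing the kind of derived form
                "forms": [("def", "s")],
            }
        ]
    }
}


def template_args(form, number):
    """Build the trailing template arguments for one (form, number) pair.

    Assumption: the count form is stored under the form name "count".
    It is always plural, so it takes the fixed designation "count|form".
    """
    if form == "count":
        return "count|form"
    return f"{form}|{number}"


def flatten(pages):
    """Return one (title, lemma, derived form, template args) row per form."""
    rows = []
    for title, data in pages.items():
        for association in data["associations"]:
            for lemma, derived in association["mapping"]:
                for form, number in association["forms"]:
                    rows.append((title, lemma, derived, template_args(form, number)))
    return rows


if __name__ == "__main__":
    for row in flatten(pages_to_create):
        print(row)
```

Keeping the count-form special case in one small function means the iteration over titles does not need to know about it.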
KovachevBot
80d1735a6b
Final form of bot before submission
Expanded the functionality of the bot to create derived forms for multiple etymologies at a time; created a loop to iterate over all lemmas in the category for "Bulgarian nouns". There are probably various other improvements that I cannot recall.
2021-06-20 16:06:26 +01:00
KovachevBot
8f96b945d5
Uploaded the bot script file 2021-06-18 18:46:04 +01:00
KovachevBot
b18325878c
Update README.md 2021-06-18 18:44:43 +01:00
KovachevBot
e966b0f20b
Initial commit 2021-06-18 18:40:33 +01:00