Create Scribe-Data Swahili data process queries #214
Comments
CC @LevisNgigi who expressed interest in working on this :) Can you write in here and I'll assign? Feel free to make the directory structure as you see the other languages are structured! |
Yes I can write in here and I will be glad to be assigned this. Yes I will make the directory structure as I have seen for other languages. |
Fantastic, @LevisNgigi! Looking forward to the contribution :) |
Question: I am currently querying data from the Wikidata Query Service, and the columns for singular and plural are currently empty for the Swahili language. Is it possible to get clarification on how to proceed? |
Can you paste your query, @LevisNgigi? Maybe there's little data on Wikidata right now, or it's not categorized correctly 🤔 You can also try to remove everything from the query and just get Swahili words to check if there's info there :) Nouns at the very least are usually consistent, so you'll still be able to send along your code that will work when there is data :) |
SELECT DISTINCT WHERE {
} LIMIT 100 |
You can also use src/scribe_data/check_language_data.sparql to check the data totals :) It looks like there are 203 Swahili nouns using |
By the looks of it singulars and plurals haven't been added for them yet, which is ok 😊 When I started Scribe years ago so many languages had no data. French only had two verbs with conjugations, and now there are thousands. Can you convert the lexeme over to just the LID instead of the URI, and from there I think we should be good for now :) |
You can see the conversion in other queries :) |
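As a rough illustration of the URI-to-LID conversion mentioned above: Wikidata returns lexemes as full entity URIs, and the conversion keeps only the trailing ID. In SPARQL this is commonly written with the standard functions `REPLACE(STR(?lexeme), "http://www.wikidata.org/entity/", "")`; the Python helper below is a hypothetical equivalent for post-processing results, not code taken from Scribe-Data.

```python
# Hypothetical helper (not part of Scribe-Data): strip the Wikidata entity
# prefix so that only the bare lexeme ID (LID) remains.
WIKIDATA_ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def lexeme_uri_to_lid(uri: str) -> str:
    """Return the bare lexeme ID (e.g. 'L123') from a Wikidata entity URI."""
    if uri.startswith(WIKIDATA_ENTITY_PREFIX):
        return uri[len(WIKIDATA_ENTITY_PREFIX):]
    # Already a bare ID, or an unexpected format: leave it untouched.
    return uri


if __name__ == "__main__":
    # 'L123' is a made-up ID used only for illustration.
    print(lexeme_uri_to_lid("http://www.wikidata.org/entity/L123"))
```

Doing the conversion inside the query itself keeps the output consistent with the other languages' queries, which is what the comment above asks for.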
Yes, I just checked and there are only 203 nouns and 20 verbs. Should I proceed, or does it need more data? |
Proceed by all means, @LevisNgigi! There will be more data eventually :) |
For now do your best, and we can revisit the queries later 😊 |
Thank you. I really appreciate your help. |
To quote you: The pleasure is mine :) |
Hey, I would like to also work on this issue |
Hey @VNW22 👋 I think that @LevisNgigi has nouns covered. Would you want to make an adjectives query? |
yeah, happy to work on the adjectives query. |
Ok, check the Bengali adjectives query and make something similar in a Swahili directory :) |
Hey @andrewtavis, I would also like to work on this. Kindly let me know if there is any way I could contribute. |
I'll leave it to the other contributors to say if there's more work to do here :) We'll make more issues soon. |
Sure, no worries. I'll be on the lookout |
I think we have the nouns, verbs and adjectives queries, so I think there is no more work here for now, @GicharuElvis. |
No worries. Let me have a look at the other issues. :) |
Okay :) You can also check Scribe-Android: https://github.com/scribe-org/Scribe-Android |
Or for this we could also do an adverbs one or prepositions :) Not sure on that for Swahili, but could be something to look into 😊 |
Hey all 👋 In regards to the Swahili work, I did some filtering for |
Hey @andrewtavis, the filtering you used works perfectly, as the Swahili that is written and spoken today uses Latin script. The Arabic style of writing fizzled out with the coming of the missionaries, who introduced Latin script, and it is no longer in use. Also, there are no other names for Swahili; it simply borrowed a lot from Arabic, hence the use of the Arabic-letter style back in the day. |
Thanks for letting me know, @LevisNgigi! |
Just added a list of data types that we want to include to this issue :) Have marked those that are already done or have PRs open, and we can work on the others 😊 If the data type can't work, then we can move to the others and open up specific issues later :) |
Sounds great let me have a look at them now. |
8b4fead was a needed fix here, @LevisNgigi :) The queries for adverbs and prepositions were still using the QID for adjectives, so because of that we were getting adjectives back for both. Check it out so you can see the difference! |
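To make the bug described above concrete, here is a minimal sketch in which each data type gets its own lexical category QID, so a query cannot be copied while silently keeping the adjectives QID. The function name and mapping are hypothetical, and the QIDs (including `Q7838` for Swahili) are given as I recall them from Wikidata, so they should be verified before use.

```python
# Hypothetical sketch, not Scribe-Data code. QIDs are assumptions to verify
# against Wikidata before relying on them.
SWAHILI_QID = "Q7838"

LEXICAL_CATEGORY_QIDS = {
    "nouns": "Q1084",
    "verbs": "Q24905",
    "adjectives": "Q34698",
    "adverbs": "Q380057",
    "prepositions": "Q4833830",
}


def build_lexeme_query(data_type: str, language_qid: str = SWAHILI_QID) -> str:
    """Build a minimal SPARQL query for one data type in one language."""
    # A KeyError here flags an unknown data type instead of reusing a wrong QID.
    category_qid = LEXICAL_CATEGORY_QIDS[data_type]
    return f"""
SELECT DISTINCT
  (REPLACE(STR(?lexeme), "http://www.wikidata.org/entity/", "") AS ?lexemeID)
  ?lemma
WHERE {{
  ?lexeme dct:language wd:{language_qid} ;
    wikibase:lexicalCategory wd:{category_qid} ;
    wikibase:lemma ?lemma .
}}
""".strip()


if __name__ == "__main__":
    print(build_lexeme_query("adverbs"))
```

With the category looked up per data type, the adverbs and prepositions queries differ from the adjectives query only in the `wikibase:lexicalCategory` line, which is the line the fix changed.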
Oh my bad, seems I forgot to change that. Thank you for correcting them. I really appreciate the work that you are doing considering all the issues that have popped up lately. |
Doing the best I can, and appreciate your support, @LevisNgigi! |
Thank you. The pleasure is mine. |
This is closed up now 😊 Thanks all so much for the hard work here! |
Thank you so much for your help as well Andrew. |
Very welcome, @LevisNgigi :) |
@andrewtavis I hope it's not too late. I was reviewing the adjectives query that I worked on and saw that there are forms I did not include, so I was hoping to expand the query. |
Yes, by all means, @VNW22! Really appreciate you going back through and expanding these :) We can do individual issues in the future as well 😊 |
Description
This issue would create the queries for Swahili in the src/scribe_data/language_data_extraction directory. To start we can make a nouns query and a verbs query in two separate PRs, and from there we can make new issues for other types of data. These queries can be based on the already existing queries for other languages 😊
Data types to include:
Contribution
Happy to support and answer any questions that might come up in this process! Can also review when the PRs are up :)