Commit f73b9f5

Author: Alham Fikri Aji
edit projects layout
1 parent 74d11d3 commit f73b9f5

2 files changed, +6 -17 lines changed

assets/css/main.scss

+2-2
@@ -7,11 +7,11 @@ search: false

 $link-color: rgb(232, 128, 1);

-$right-sidebar-width-narrow: 100px;
+$right-sidebar-width-narrow: 120px;
 $right-sidebar-width: 180px;
 $right-sidebar-width-wide: 260px;

-$doc-font-size: 8;
+$doc-font-size: 14;

 @import "minimal-mistakes/skins/{{ site.minimal_mistakes_skin | default: 'default' }}"; // skin
 @import "minimal-mistakes"; // main partials

projects.md

+4-15
@@ -17,23 +17,13 @@ header:
 We will initiate a hackathon to centralize many NLP datasets in Indonesian and local languages. Indonesian languages are diverse and scattered, so a unified location that joins multiple sources while preserving the data closest to the original form can greatly help accessibility. We propose a unified schema for dataset extraction to implement as many datasets as possible to enable reproducibility in data processing. Stay tuned for the next update!

 # Past Projects
-Currently, we have built **5 new benchmarks** to support NLP research on Indonesian languages and published papers in top NLP conferences. You can check this page for more details.
-
-* [2022](https://indonlp.github.io/projects#2022)
-* [NusaX](https://indonlp.github.io/projects#nusax)
-* [One Country, 700+ Languages](https://indonlp.github.io/projects#one-country-700-languages)
-* [2021](https://indonlp.github.io/projects#2021)
-* [IndoNLG](https://indonlp.github.io/projects#indonlg)
-* [IndoNLI](https://indonlp.github.io/projects#indonli)
-* [2020](https://indonlp.github.io/projects#2020)
-* [IndoNLU](https://indonlp.github.io/projects#indonlu)
-* [IndoLEM](https://indonlp.github.io/projects#indolem)
+Currently, we have built **5 new benchmarks** to support NLP research on Indonesian languages and published papers in top NLP conferences, as well as providing overview of the current state of NLP research for Indonesia. You can check this page for more details.

 ## 2022

-### NusaX
+### Enabling NLP research in local languages

-NusaX is a high-quality multilingual parallel corpus for Indonesian local languages elicited by native speakers. NusaX covers 12 languages, Indonesian, English, and 10 Indonesian local languages, namely Acehnese, Balinese, Banjarese, Buginese, Madurese, Minangkabau, Javanese, Ngaju, Sundanese, and Toba Batak.
+NLP research in regional languages is still limited. In this project, we initiated to kickstart NLP in regional languages by creating *NusaX*: a high-quality multilingual parallel corpus for Indonesian local languages elicited by native speakers. NusaX covers 12 languages, Indonesian, English, and 10 Indonesian local languages, namely Acehnese, Balinese, Banjarese, Buginese, Madurese, Minangkabau, Javanese, Ngaju, Sundanese, and Toba Batak.

 <i class="fas fa-book" aria-hidden="true"></i> **Paper:** NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Language [Preprint arXiv 2022](https://arxiv.org/pdf/2205.15960.pdf){: .btn .btn--info .btn--small }
 {: .notice}
@@ -44,9 +34,8 @@ NusaX is a high-quality multilingual parallel corpus for Indonesian local langua
 <i class="fas fa-database" aria-hidden="true"></i> **Dataset:** [https://github.com/IndoNLP/nusax](https://github.com/IndoNLP/nusax)
 {: .notice--info}

-### One Country, 700+ Languages

-We provide an overview of the current state of NLP research for Indonesia’s 700+ languages. We highlight challenges in Indonesian NLP and how these affect the performance of current NLP systems. Finally, we provide general recommendations to help develop NLP technology not only for languages of Indonesia but also other underrepresented languages.
+Additionally, We provide an overview of the current state of NLP research for Indonesia’s 700+ local languages. We highlight challenges in Indonesian NLP and how these affect the performance of current NLP systems. Finally, we provide general recommendations to help develop NLP technology not only for languages of Indonesia but also other underrepresented languages.

 <i class="fas fa-book" aria-hidden="true"></i> **Paper:** One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia [ACL 2022](https://aclanthology.org/2022.acl-long.500.pdf){: .btn .btn--info .btn--small }
 {: .notice}
