Open In Colab  View Notebook on GitHub

๐Ÿค” ๅˆๅญฆ่€…ไฝฟ็”จ BERT ๅพฎ่ฐƒ NER ๆจกๅž‹#

ๆ‚จๆ˜ฏๅˆๅญฆ่€…ๅ—๏ผŸๆ‚จๆƒณๅญฆไน ๏ผŒไฝ†ไธ็Ÿฅ้“ไปŽๅ“ช้‡Œๅผ€ๅง‹๏ผŸๅœจๆœฌๆ•™็จ‹ไธญ๏ผŒๆ‚จๅฐ†ๅญฆไน ๅฆ‚ไฝ•ไธบๅ‘ฝๅๅฎžไฝ“่ฏ†ๅˆซๅพฎ่ฐƒ้ข„่ฎญ็ปƒ็š„ BERT ๆจกๅž‹ใ€‚ๅฎƒๅฐ†ๅผ•ๅฏผๆ‚จๅฎŒๆˆไปฅไธ‹ๆญฅ้ชค

  • 🚀 Load your training dataset into Argilla and explore it using its tools.

  • ⏳ Preprocess the data to generate the extra inputs the model needs, and put them in the format the model expects.

  • 🔍 Download the BERT model and start fine-tuning it.

  • 🧪 Run your own tests!

NER

Introduction#

Our goal is to show how to fine-tune a tiny BERT model to identify NER tags from a training dataset.

To do so, we will first connect to Argilla and log our dataset, so that we can analyze it in a more visual way.

💡 Tip: If you want to try a dataset different from the one in this tutorial and it is not annotated yet, Argilla has several tutorials on how to annotate manually or automatically.

Next, we will preprocess our dataset and fine-tune the model. Here we will use DistilBERT, to make it easier to understand and to start playing with the parameters. However, there are many similar models available to explore.

✨ Let's get started!

่ฟ่กŒ Argilla#

For this tutorial, you will need a running Argilla server. There are two main options for deploying and running Argilla:

  1. Deploy Argilla on Hugging Face Spaces: this is the fastest option if you have an account on Hugging Face, and it is the recommended choice for connecting to external notebooks (e.g., Google Colab).

deploy on spaces

  2. Launch Argilla using Argilla's quickstart Docker image: this is the recommended option if you want Argilla running on your local machine. Note that this option only lets you run the tutorial locally, not with an external notebook service.

For more information on deployment options, please check the Deployment section of the documentation.

๐Ÿคฏ ๆ็คบ

ๆœฌๆ•™็จ‹ๆ˜ฏไธ€ไธช Jupyter Notebookใ€‚ๆœ‰ไธค็ง่ฟ่กŒๆ–นๅผ๏ผš - ไฝฟ็”จๆญค้กต้ข้กถ้ƒจ็š„โ€œๅœจ Colab ไธญๆ‰“ๅผ€โ€ๆŒ‰้’ฎใ€‚ๆญค้€‰้กนๅ…่ฎธๆ‚จ็›ดๆŽฅๅœจ Google Colab ไธŠ่ฟ่กŒ notebookใ€‚ไธ่ฆๅฟ˜่ฎฐๅฐ†่ฟ่กŒๆ—ถ็ฑปๅž‹ๆ›ดๆ”นไธบ GPU ไปฅๅŠ ๅฟซๆจกๅž‹่ฎญ็ปƒๅ’ŒๆŽจ็†้€Ÿๅบฆใ€‚ - ้€š่ฟ‡ๅ•ๅ‡ป้กต้ข้กถ้ƒจ็š„โ€œๆŸฅ็œ‹ๆบไปฃ็ โ€้“พๆŽฅไธ‹่ฝฝ .ipynb ๆ–‡ไปถใ€‚ๆญค้€‰้กนๅ…่ฎธๆ‚จไธ‹่ฝฝ notebook ๅนถๅœจๆœฌๅœฐ่ฎก็ฎ—ๆœบๆˆ–ๆ‚จ้€‰ๆ‹ฉ็š„ Jupyter notebook ๅทฅๅ…ทไธŠ่ฟ่กŒๅฎƒใ€‚

Setup#

For this tutorial, you will need to install the Argilla client and a few third-party libraries using pip:

[1]:
%pip install "argilla[server]==1.5.0" -qqq
%pip install datasets
%pip install transformers
%pip install evaluate
%pip install seqeval
%pip install transformers[torch]
%pip install accelerate -U
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 2.0/2.0 MB 11.1 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 71.5/71.5 kB 8.9 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 238.1/238.1 kB 15.2 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 55.5/55.5 kB 7.1 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 214.7/214.7 kB 14.7 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 385.3/385.3 kB 17.0 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 56.9/56.9 kB 5.9 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 51.6/51.6 kB 7.2 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 45.7/45.7 kB 6.1 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 525.6/525.6 kB 18.3 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 69.9/69.9 kB 9.4 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 2.7/2.7 MB 25.6 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 59.5/59.5 kB 8.7 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 3.1/3.1 MB 33.9 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 64.3/64.3 kB 8.4 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 69.6/69.6 kB 9.3 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 49.6/49.6 kB 6.9 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 143.1/143.1 kB 19.1 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 593.7/593.7 kB 29.5 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 142.9/142.9 kB 20.5 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 51.1/51.1 kB 6.5 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 58.3/58.3 kB 8.0 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 428.8/428.8 kB 36.5 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 4.1/4.1 MB 49.8 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 1.3/1.3 MB 50.5 MB/s eta 0:00:00
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 129.9/129.9 kB 16.5 MB/s eta 0:00:00
Collecting datasets
  Downloading datasets-2.14.4-py3-none-any.whl (519 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 519.3/519.3 kB 8.5 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from datasets) (1.23.5)
Requirement already satisfied: pyarrow>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets) (9.0.0)
Collecting dill<0.3.8,>=0.3.0 (from datasets)
  Downloading dill-0.3.7-py3-none-any.whl (115 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 115.3/115.3 kB 14.8 MB/s eta 0:00:00
Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets) (1.5.3)
Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.10/dist-packages (from datasets) (2.31.0)
Requirement already satisfied: tqdm>=4.62.1 in /usr/local/lib/python3.10/dist-packages (from datasets) (4.66.1)
Collecting xxhash (from datasets)
  Downloading xxhash-3.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 194.1/194.1 kB 22.8 MB/s eta 0:00:00
Collecting multiprocess (from datasets)
  Downloading multiprocess-0.70.15-py310-none-any.whl (134 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 134.8/134.8 kB 18.1 MB/s eta 0:00:00
Requirement already satisfied: fsspec[http]>=2021.11.1 in /usr/local/lib/python3.10/dist-packages (from datasets) (2023.6.0)
Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets) (3.8.5)
Collecting huggingface-hub<1.0.0,>=0.14.0 (from datasets)
  Downloading huggingface_hub-0.16.4-py3-none-any.whl (268 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 268.8/268.8 kB 35.3 MB/s eta 0:00:00
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from datasets) (23.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from datasets) (6.0.1)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (23.1.0)
Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (3.2.0)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (6.0.4)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (4.0.3)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.9.2)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.4.0)
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets) (1.3.1)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0.0,>=0.14.0->datasets) (3.12.2)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0.0,>=0.14.0->datasets) (4.7.1)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets) (2023.7.22)
Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets) (2023.3)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas->datasets) (1.16.0)
Installing collected packages: xxhash, dill, multiprocess, huggingface-hub, datasets
Successfully installed datasets-2.14.4 dill-0.3.7 huggingface-hub-0.16.4 multiprocess-0.70.15 xxhash-3.3.0
Collecting transformers
  Downloading transformers-4.33.0-py3-none-any.whl (7.6 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 7.6/7.6 MB 19.2 MB/s eta 0:00:00
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers) (3.12.2)
Requirement already satisfied: huggingface-hub<1.0,>=0.15.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.16.4)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (1.23.5)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers) (23.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (2023.6.3)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers) (2.31.0)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1 (from transformers)
  Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 7.8/7.8 MB 48.2 MB/s eta 0:00:00
Collecting safetensors>=0.3.1 (from transformers)
  Downloading safetensors-0.3.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 1.3/1.3 MB 53.1 MB/s eta 0:00:00
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers) (4.66.1)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.15.1->transformers) (2023.6.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.15.1->transformers) (4.7.1)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.2.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (2023.7.22)
Installing collected packages: tokenizers, safetensors, transformers
Successfully installed safetensors-0.3.3 tokenizers-0.13.3 transformers-4.33.0
Collecting evaluate
  Downloading evaluate-0.4.0-py3-none-any.whl (81 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 81.4/81.4 kB 2.4 MB/s eta 0:00:00
Requirement already satisfied: datasets>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from evaluate) (2.14.4)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from evaluate) (1.23.5)
Requirement already satisfied: dill in /usr/local/lib/python3.10/dist-packages (from evaluate) (0.3.7)
Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from evaluate) (1.5.3)
Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.10/dist-packages (from evaluate) (2.31.0)
Requirement already satisfied: tqdm>=4.62.1 in /usr/local/lib/python3.10/dist-packages (from evaluate) (4.66.1)
Requirement already satisfied: xxhash in /usr/local/lib/python3.10/dist-packages (from evaluate) (3.3.0)
Requirement already satisfied: multiprocess in /usr/local/lib/python3.10/dist-packages (from evaluate) (0.70.15)
Requirement already satisfied: fsspec[http]>=2021.05.0 in /usr/local/lib/python3.10/dist-packages (from evaluate) (2023.6.0)
Requirement already satisfied: huggingface-hub>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from evaluate) (0.16.4)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from evaluate) (23.1)
Collecting responses<0.19 (from evaluate)
  Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Requirement already satisfied: pyarrow>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.0.0->evaluate) (9.0.0)
Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets>=2.0.0->evaluate) (3.8.5)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.0.0->evaluate) (6.0.1)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.7.0->evaluate) (3.12.2)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.7.0->evaluate) (4.7.1)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->evaluate) (3.2.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->evaluate) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->evaluate) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->evaluate) (2023.7.22)
Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas->evaluate) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->evaluate) (2023.3)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (23.1.0)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (6.0.4)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (4.0.3)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (1.9.2)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (1.4.0)
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (1.3.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas->evaluate) (1.16.0)
Installing collected packages: responses, evaluate
Successfully installed evaluate-0.4.0 responses-0.18.0
Collecting seqeval
  Downloading seqeval-1.2.2.tar.gz (43 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 43.6/43.6 kB 1.5 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Requirement already satisfied: numpy>=1.14.0 in /usr/local/lib/python3.10/dist-packages (from seqeval) (1.23.5)
Requirement already satisfied: scikit-learn>=0.21.3 in /usr/local/lib/python3.10/dist-packages (from seqeval) (1.2.2)
Requirement already satisfied: scipy>=1.3.2 in /usr/local/lib/python3.10/dist-packages (from scikit-learn>=0.21.3->seqeval) (1.10.1)
Requirement already satisfied: joblib>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from scikit-learn>=0.21.3->seqeval) (1.3.2)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from scikit-learn>=0.21.3->seqeval) (3.2.0)
Building wheels for collected packages: seqeval
  Building wheel for seqeval (setup.py) ... done
  Created wheel for seqeval: filename=seqeval-1.2.2-py3-none-any.whl size=16162 sha256=a3e4deed0ae4f82793ec07d332ea0faca9b72401ac85aa8047235d5fec9ef8ce
  Stored in directory: /root/.cache/pip/wheels/1a/67/4a/ad4082dd7dfc30f2abfe4d80a2ed5926a506eb8a972b4767fa
Successfully built seqeval
Installing collected packages: seqeval
Successfully installed seqeval-1.2.2
Requirement already satisfied: transformers[torch] in /usr/local/lib/python3.10/dist-packages (4.33.0)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (3.12.2)
Requirement already satisfied: huggingface-hub<1.0,>=0.15.1 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (0.16.4)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (1.23.5)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (23.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (6.0.1)
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (2023.6.3)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (2.31.0)
Requirement already satisfied: tokenizers!=0.11.3,<0.14,>=0.11.1 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (0.13.3)
Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (0.3.3)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (4.66.1)
Requirement already satisfied: torch!=1.12.0,>=1.10 in /usr/local/lib/python3.10/dist-packages (from transformers[torch]) (2.0.1+cu118)
Collecting accelerate>=0.20.3 (from transformers[torch])
  Downloading accelerate-0.22.0-py3-none-any.whl (251 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 251.2/251.2 kB 5.0 MB/s eta 0:00:00
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.20.3->transformers[torch]) (5.9.5)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.15.1->transformers[torch]) (2023.6.0)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub<1.0,>=0.15.1->transformers[torch]) (4.7.1)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch!=1.12.0,>=1.10->transformers[torch]) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch!=1.12.0,>=1.10->transformers[torch]) (3.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch!=1.12.0,>=1.10->transformers[torch]) (3.1.2)
Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch!=1.12.0,>=1.10->transformers[torch]) (2.0.0)
Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch!=1.12.0,>=1.10->transformers[torch]) (3.27.2)
Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch!=1.12.0,>=1.10->transformers[torch]) (16.0.6)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers[torch]) (3.2.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers[torch]) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers[torch]) (1.26.16)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers[torch]) (2023.7.22)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch!=1.12.0,>=1.10->transformers[torch]) (2.1.3)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch!=1.12.0,>=1.10->transformers[torch]) (1.3.0)
Installing collected packages: accelerate
Successfully installed accelerate-0.22.0
Requirement already satisfied: accelerate in /usr/local/lib/python3.10/dist-packages (0.22.0)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from accelerate) (1.23.5)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (23.1)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate) (5.9.5)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate) (6.0.1)
Requirement already satisfied: torch>=1.10.0 in /usr/local/lib/python3.10/dist-packages (from accelerate) (2.0.1+cu118)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.12.2)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (4.7.1)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (3.1.2)
Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.10.0->accelerate) (2.0.0)
Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.10.0->accelerate) (3.27.2)
Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.10.0->accelerate) (16.0.6)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.10.0->accelerate) (2.1.3)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.10.0->accelerate) (1.3.0)

Let's import the Argilla module for reading and writing data:

[2]:
import argilla as rg

If you are running Argilla using the Docker quickstart image or Hugging Face Spaces, you need to initialize the Argilla client with the URL and API_KEY:

[3]:
# Replace api_url with the url to your HF Spaces URL if using Spaces
# Replace api_key if you configured a custom API key
# Replace workspace with the name of your workspace
rg.init(
    api_url="http://localhost:6900",
    api_key="owner.apikey",
    workspace="admin"
)

ๅฆ‚ๆžœๆ‚จๆญฃๅœจ่ฟ่กŒ็งๆœ‰็š„ Hugging Face Space๏ผŒๆ‚จ่ฟ˜้œ€่ฆๆŒ‰ๅฆ‚ไธ‹ๆ–นๅผ่ฎพ็ฝฎ HF_TOKEN

[ ]:
# # Set the HF_TOKEN environment variable
# import os
# os.environ['HF_TOKEN'] = "your-hf-token"

# # Replace api_url with the url to your HF Spaces URL
# # Replace api_key if you configured a custom API key
# # Replace workspace with the name of your workspace
# rg.init(
#     api_url="https://[your-owner-name]-[your_space_name].hf.space",
#     api_key="owner.apikey",
#     workspace="admin",
#     extra_headers={"Authorization": f"Bearer {os.environ['HF_TOKEN']}"},
# )

ๆœ€ๅŽ๏ผŒ่ฎฉๆˆ‘ไปฌๅŒ…ๅซๆˆ‘ไปฌ้œ€่ฆ็š„ๅฏผๅ…ฅ

[4]:
import pandas as pd
import random
import evaluate
import transformers
import numpy as np
import torch
import pickle

from datasets import load_dataset, ClassLabel, Sequence
from argilla.metrics.token_classification import top_k_mentions
from argilla.metrics.token_classification.metrics import Annotations
from IPython.display import display, HTML
from sklearn.model_selection import train_test_split
from transformers import AutoTokenizer, AutoModelForTokenClassification, TrainingArguments, Trainer, DataCollatorForTokenClassification, pipeline

ๅฏ็”จ้ฅๆต‹#

We gain valuable insights from how you interact with our tutorials. To improve ourselves and offer you the most suitable content, the following lines of code help us understand whether this tutorial is serving you effectively. Though this is entirely anonymous, you can choose to skip this step if you prefer. For more info, please check out the Telemetry page.

[ ]:
try:
    from argilla.utils.telemetry import tutorial_running
    tutorial_running()
except ImportError:
    print("Telemetry is introduced in Argilla 1.20.0 and not found in the current installation. Skipping telemetry.")

🚀 Explore our dataset#

First, we will load the train split of our dataset from Hugging Face using load_dataset, so we can explore it. As we can see, it has 119 entries and two columns: one with the sequence of tokens and the other with the sequence of NER tags.

[5]:
dataset = load_dataset("argilla/spacy_sm_wnut17", split = "train")
[6]:
dataset
[6]:
Dataset({
    features: ['tokens', 'ner_tags'],
    num_rows: 119
})

ๆŽฅไธ‹ๆฅ๏ผŒๆˆ‘ไปฌๅฐ†ไฝฟ็”จไปฅไธ‹ไปฃ็ ๏ผŒๅˆฉ็”จ DatasetDict ้€‰้กน Features ๅฐ†ๅ…ถ่ฝฌๆขไธบ Argilla ๆ‰€้œ€็š„ๆ ผๅผไปฅ่ฟ›่กŒๆ—ฅๅฟ—่ฎฐๅฝ•ใ€‚

ๆˆ‘ไปฌ็š„ๆ•ฐๆฎๅฟ…้กปๅ…ทๅค‡็š„ไธ‰ไธช Token ๅˆ†็ฑปๅ…ƒ็ด ๅฆ‚ไธ‹

  • text๏ผšๅฎŒๆ•ด็š„ๅญ—็ฌฆไธฒใ€‚

  • tokens๏ผštoken ๅบๅˆ—ใ€‚

  • annotation๏ผš็”ฑๆ ‡็ญพใ€่ตทๅง‹ไฝ็ฝฎๅ’Œ็ป“ๆŸไฝ็ฝฎ็ป„ๆˆ็š„ๅ…ƒ็ป„ใ€‚

โš ๏ธ ่ฏทๆณจๆ„๏ผš ๆฏๆฌกๆ‰ง่กŒ้ƒฝไผšๅ†ๆฌกไธŠไผ ๅ’ŒๆทปๅŠ ๆ‚จ็š„ๆ ‡ๆณจ๏ผŒ่€Œไธไผš่ขซ่ฆ†็›–ใ€‚

[79]:
# Create a function to read the sequences
def parse_entities(record):
    current_entity = None  # the entity currently being built, if any
    current_info = []  # entities collected for the whole sentence
    char_position = 0
    entities = []  # final list of (tag, start, end) tuples

    # Iterate over the tokens and ner tags
    for i in range(len(record["ner_tags"])):
        token = record["tokens"][i]
        ner_tag = dataset.features["ner_tags"].feature.names[record["ner_tags"][i]]

        if ner_tag.startswith("B-"):
            if current_entity:
                current_info.append(current_entity)
            current_entity = {"word": token, "start": char_position, "tag": ner_tag[2:]}
        elif ner_tag.startswith("I-") and current_entity:
            current_entity["word"] += " " + token

        # Advance past this token plus the whitespace that joins tokens in the text
        char_position += len(token) + 1

    # Add the last entity if it exists
    if current_entity:
        current_info.append(current_entity)

    # Calculate the end position for each entity and build the tuples
    for entity in current_info:
        entity["end"] = entity["start"] + len(entity["word"])
        entities.append((entity["tag"], entity["start"], entity["end"]))

    return entities
[ ]:
# Write a loop to iterate over each row of your dataset and add the text, tokens, and tuple
records = [
    rg.TokenClassificationRecord(
        text=" ".join(row["tokens"]),
        tokens=row["tokens"],
        annotation=parse_entities(row),
    )
    for row in dataset
]

# Log the records with the name of your choice
rg.log(records, "spacy_sm_wnut17")

Now you will be able to check your annotations in a more visual way, and even edit them when necessary.

argilla-annotations
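If you prefer to stay in the notebook, you can also pull the logged records back as a pandas DataFrame for a quick look (a sketch using the Argilla 1.x client):

# Load the logged dataset from Argilla and convert it to pandas for inspection
df = rg.load("spacy_sm_wnut17").to_pandas()
df.head()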

In addition, Argilla has more options, such as extracting metrics, as shown below.

[ ]:
# Select the dataset from Argilla and visualize the data
top_k_mentions(
    name="spacy_sm_wnut17", k=30, threshold=2, compute_for=Annotations
).visualize()

โณ ้ข„ๅค„็†ๆ•ฐๆฎ#

ๆŽฅไธ‹ๆฅ๏ผŒๆˆ‘ไปฌๅฐ†ไปฅๆ‰€้œ€็š„ๆ ผๅผ้ข„ๅค„็†ๆˆ‘ไปฌ็š„ๆ•ฐๆฎ๏ผŒไปฅไพฟๆจกๅž‹ๅฏไปฅไฝฟ็”จๅฎƒใ€‚ๅœจๆˆ‘ไปฌ็š„ไพ‹ๅญไธญ๏ผŒๆˆ‘ไปฌๅฐ†ไปŽ HuggingFace ้‡ๆ–ฐๅŠ ่ฝฝๅฎƒไปฌ๏ผŒๅ› ไธบๅœจ Argilla ไธญๆˆ‘ไปฌๅชๅŠ ่ฝฝไบ†่ฎญ็ปƒ้›†๏ผŒไฝ†ๆ˜ฏ๏ผŒ่ฟ™ไนŸๆ˜ฏๅฏ่ƒฝ็š„ใ€‚

ไปฅไธ‹ไปฃ็ ๅฐ†ๅ…่ฎธๆˆ‘ไปฌไฝฟ็”จ Argilla ๅ‡†ๅค‡ๆˆ‘ไปฌ็š„ๆ•ฐๆฎ๏ผŒ่ฟ™ๅฏนไบŽๆ‰‹ๅŠจๆ ‡ๆณจ็‰นๅˆซๆœ‰็”จ๏ผŒๅ› ไธบๅฎƒไผš่‡ชๅŠจๅฐ† B-๏ผˆๅผ€ๅง‹๏ผ‰ๆˆ– I-๏ผˆๅ†…้ƒจ๏ผ‰ๆทปๅŠ ๅˆฐๆˆ‘ไปฌ็š„ NER ๆ ‡็ญพ๏ผŒๅ…ทไฝ“ๅ–ๅ†ณไบŽๅฎƒไปฌ็š„ไฝ็ฝฎใ€‚

dataset = rg.load("dataset_name").prepare_for_training()

dataset = dataset.train_test_split()

๐Ÿคฏ ๆ็คบ๏ผš ๅœจๆˆ‘ไปฌ็š„ไพ‹ๅญไธญ๏ผŒๆˆ‘ไปฌๆญฃๅœจๅค„็†ไธ€ไธช้žๅธธๅฐ็š„ๆ•ฐๆฎ้›†๏ผŒ่ฏฅๆ•ฐๆฎ้›†ๅˆ†ไธบ่ฎญ็ปƒ้›†ๅ’Œๆต‹่ฏ•้›†ใ€‚ไฝ†ๆ˜ฏ๏ผŒๆ‚จๅฏ่ƒฝๆญฃๅœจไฝฟ็”จๅฆไธ€ไธชๅทฒ็ปๆœ‰ validation ๅˆ†ๅŒบ็š„ๆ•ฐๆฎ้›†๏ผŒๆˆ–่€…ๅณไฝฟๅฎƒๆ›ดๅคง๏ผŒๆ‚จไนŸๅฏไปฅไฝฟ็”จไปฅไธ‹ไปฃ็ ่‡ช่กŒๅˆ›ๅปบๆญคๅˆ†ๅŒบ

dataset['train'], dataset['validation'] = dataset['train'].train_test_split(.1).values()

้‚ฃไนˆ๏ผŒ่ฎฉๆˆ‘ไปฌ็ปง็ปญ๏ผ

[ ]:
dataset = load_dataset("argilla/spacy_sm_wnut17")
print(dataset)
WARNING:datasets.builder:Found cached dataset parquet (/root/.cache/huggingface/datasets/argilla___parquet/argilla--spacy_sm_wnut17-1babd564207f27f8/0.0.0/14a00e99c0d15a23649d0db8944380ac81082d4b021f398733dd84f3a6c569a7)
DatasetDict({
    train: Dataset({
        features: ['tokens', 'ner_tags'],
        num_rows: 119
    })
    test: Dataset({
        features: ['tokens', 'ner_tags'],
        num_rows: 30
    })
})
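Before moving on, it can help to peek at the label names behind the integer ner_tags (a quick check using the datasets Features API):

# Inspect the human-readable names behind the integer NER tags
print(dataset["train"].features["ner_tags"].feature.names)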

Time to tokenize! Although it may look like this step is already done, each token still needs to be converted into a vector (ID) that the model can read from its pretrained vocabulary. To do so, we will use AutoTokenizer.from_pretrained with the fast tokenizer of distilbert-base-uncased.

[ ]:
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
assert isinstance(tokenizer, transformers.PreTrainedTokenizerFast)
[ ]:
# Example of original tokens
example = dataset["train"][0]
print(example["tokens"])

# Example after executing the AutoTokenizer
tokenized_input = tokenizer(example["tokens"], is_split_into_words=True)
tokens = tokenizer.convert_ids_to_tokens(tokenized_input["input_ids"])
print(tokens)
['says', 'it', "'s", 'Saturday', '!', 'I', "'m", 'wearing', 'my', 'Weekend', '!', ':)']
['[CLS]', 'says', 'it', "'", 's', 'saturday', '!', 'i', "'", 'm', 'wearing', 'my', 'weekend', '!', ':', ')', '[SEP]']

However, we now run into a new problem. Since the tokenizer splits words according to the pretrained vocabulary, it creates new subdivisions within some words (for example, "'s" into "'" and "s"). It also adds two new special tokens, [CLS] and [SEP]. Therefore, we must realign the word IDs with the corresponding NER tags, with the help of the word-ids method.
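To see the mapping this method gives us, you can inspect word_ids() on the example tokenized above: special tokens map to None, and subword pieces share the index of the word they came from.

# Map each subword token back to the index of the original word
word_ids = tokenized_input.word_ids()
print(word_ids)
# expected: [None, 0, 1, 2, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 11, None]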

[ ]:
label_all_tokens = True

def tokenize_and_align_labels(examples):
    tokenized_inputs = tokenizer(examples["tokens"], truncation=True, is_split_into_words=True)

    labels = []
    for i, label in enumerate(examples["ner_tags"]):
        word_ids = tokenized_inputs.word_ids(batch_index=i)
        previous_word_idx = None
        label_ids = []
        for word_idx in word_ids:
            # Special tokens have a word id that is None. We set the label to -100 so they are automatically
            # ignored in the loss function.
            if word_idx is None:
                label_ids.append(-100)
            # We set the label for the first token of each word.
            elif word_idx != previous_word_idx:
                label_ids.append(label[word_idx])
            # For the other tokens in a word, we set the label to either the current label or -100, depending on
            # the label_all_tokens flag.
            else:
                label_ids.append(label[word_idx] if label_all_tokens else -100)
            previous_word_idx = word_idx

        labels.append(label_ids)

    tokenized_inputs["labels"] = labels
    return tokenized_inputs

tokenized_dataset = dataset.map(tokenize_and_align_labels, batched=True)
WARNING:datasets.arrow_dataset:Loading cached processed dataset at /root/.cache/huggingface/datasets/argilla___parquet/argilla--spacy_sm_wnut17-1babd564207f27f8/0.0.0/14a00e99c0d15a23649d0db8944380ac81082d4b021f398733dd84f3a6c569a7/cache-55b667584ffacf49.arrow

๐Ÿ” ๅพฎ่ฐƒๆจกๅž‹#

We should now start preparing our model's parameters, i.e., it's time to start fine-tuning.

Model#

First, we will download our pretrained model using AutoModelForTokenClassification, indicating the name of the chosen model and the number of labels, as well as the correspondence between their IDs and names.

In addition, we will set up our DataCollator, which will form batches using our processed examples as input. In this case, we will use DataCollatorForTokenClassification.

[ ]:
# Create a dictionary with the ids and the relevant label.
label_list = dataset["train"].features["ner_tags"].feature.names
id2label = {i: label for i, label in enumerate(label_list)}
label2id = {v: k for k, v in id2label.items()}

# Download the model.
model = AutoModelForTokenClassification.from_pretrained("distilbert-base-uncased", num_labels=len(label_list), id2label=id2label, label2id=label2id)

# Set the DataCollator
data_collator = DataCollatorForTokenClassification(tokenizer)
Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForTokenClassification: ['vocab_transform.bias', 'vocab_projector.bias', 'vocab_layer_norm.weight', 'vocab_layer_norm.bias', 'vocab_transform.weight']
- This IS expected if you are initializing DistilBertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DistilBertForTokenClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
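To double-check the label mapping the model was configured with, you can print it (assuming the cell above has run):

# Inspect the label set and the id <-> label mappings passed to the model
print(label_list)
print(id2label)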

่ฎญ็ปƒๅ‚ๆ•ฐ#

The TrainingArguments class will contain the parameters to customize our training.

💡 Tip: If you are using Hugging Face, it may be easier to save your model directly there. To do so, use the following code and add the parameters below to TrainingArguments.

from huggingface_hub import notebook_login
notebook_login()

# Add the following parameter
training_args = TrainingArguments(
    save_strategy="epoch",
    load_best_model_at_end=True,
    push_to_hub=True,
)
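Once the Trainer defined later in this tutorial has finished training, the final model can then be uploaded with a single call (this assumes push_to_hub=True was set as above):

# Upload the trained model and tokenizer to the Hugging Face Hub
trainer.push_to_hub()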

🕹️ Let's play: what is the best accuracy you can get?

[ ]:
training_args = TrainingArguments(
    output_dir="ner-recognition",
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=20,
    weight_decay=0.05,
    evaluation_strategy="epoch",
    optim="adamw_torch",
    logging_steps = 50
)

Metrics#

่ฆไบ†่งฃๆˆ‘ไปฌ็š„่ฎญ็ปƒ่ฟ›ๅฑ•ๅฆ‚ไฝ•๏ผŒๅฝ“็„ถ๏ผŒๆˆ‘ไปฌๅฟ…้กปไฝฟ็”จๆŒ‡ๆ ‡ใ€‚ๅ› ๆญค๏ผŒๆˆ‘ไปฌๅฐ†ไฝฟ็”จ Seqeval ๅ’Œไธ€ไธชๅ‡ฝๆ•ฐ๏ผŒ่ฏฅๅ‡ฝๆ•ฐไปŽๅฎž้™…ๆ ‡็ญพๅ’Œ้ข„ๆต‹ๆ ‡็ญพ่ฎก็ฎ—็ฒพ็กฎ็އใ€ๅฌๅ›ž็އใ€F1 ๅ’Œๅ‡†็กฎ็އใ€‚

[ ]:
# Load seqeval.
metric = evaluate.load("seqeval")

# Create the list with the tags.
labels = [label_list[i] for i in example["ner_tags"]]

# Function to compute precision, recall, F1 and accuracy.
def compute_metrics(p):
    predictions, labels = p
    predictions = np.argmax(predictions, axis=2)

    true_predictions = [
        [label_list[p] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for (p, l) in zip(prediction, label) if l != -100]
        for prediction, label in zip(predictions, labels)
    ]

    results = metric.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
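To get a feel for what seqeval returns, you can run the metric on a toy pair of label sequences (the tags below are made up for illustration):

# A minimal sanity check of the seqeval metric on made-up sequences
toy_preds = [["O", "B-person", "I-person", "O"]]
toy_refs = [["O", "B-person", "O", "O"]]
print(metric.compute(predictions=toy_preds, references=toy_refs))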

Time to train#

้กพๅๆ€ไน‰๏ผŒ็Žฐๅœจๆ˜ฏๅฐ†ๆ‰€ๆœ‰ๅ…ˆๅ‰็š„ๅ…ƒ็ด ็ป„ๅˆๅœจไธ€่ตทๅนถๅผ€ๅง‹ไฝฟ็”จ Trainer ่ฟ›่กŒ่ฎญ็ปƒ็š„ๆ—ถๅ€™ไบ†ใ€‚

[ ]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)

# Train.
trainer.train()
You're using a DistilBertTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[80/80 00:12, Epoch 20/20]
Epoch  Training Loss  Validation Loss  Precision  Recall  F1  Accuracy
1  No log  1.445835  0.000000  0.000000  0.000000  0.720751
2  No log  1.540381  0.000000  0.000000  0.000000  0.720751
3  No log  1.300941  0.000000  0.000000  0.000000  0.720751
4  No log  1.259119  0.000000  0.000000  0.000000  0.720751
5  No log  1.256542  0.444444  0.025478  0.048193  0.720751
6  No log  1.154050  0.202703  0.095541  0.129870  0.736203
7  No log  1.388463  0.254545  0.089172  0.132075  0.718543
8  No log  1.246235  0.275362  0.121019  0.168142  0.737307
9  No log  1.254787  0.202020  0.127389  0.156250  0.731788
10  No log  1.388549  0.272727  0.171975  0.210938  0.735099
11  No log  1.494627  0.297619  0.159236  0.207469  0.740618
12  No log  1.331303  0.232558  0.191083  0.209790  0.746137
13  0.675300  1.473191  0.252252  0.178344  0.208955  0.748344
14  0.675300  1.566783  0.275510  0.171975  0.211765  0.742826
15  0.675300  1.500171  0.252336  0.171975  0.204545  0.739514
16  0.675300  1.541946  0.274336  0.197452  0.229630  0.742826
17  0.675300  1.546347  0.258333  0.197452  0.223827  0.745033
18  0.675300  1.534100  0.271186  0.203822  0.232727  0.743929
19  0.675300  1.535095  0.277311  0.210191  0.239130  0.745033
20  0.675300  1.539303  0.277311  0.210191  0.239130  0.745033

/usr/local/lib/python3.10/dist-packages/seqeval/metrics/v1.py:57: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
/usr/local/lib/python3.10/dist-packages/seqeval/metrics/v1.py:57: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 due to no predicted samples. Use `zero_division` parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))
TrainOutput(global_step=80, training_loss=0.45428856909275056, metrics={'train_runtime': 14.9864, 'train_samples_per_second': 158.811, 'train_steps_per_second': 5.338, 'total_flos': 32769159790410.0, 'train_loss': 0.45428856909275056, 'epoch': 20.0})

The evaluate method will allow you to evaluate again, on the validation set or on another dataset (e.g., if you have train, validation, and test splits).

[ ]:
trainer.evaluate()
[1/1 : < :]
{'eval_loss': 1.5393034219741821,
 'eval_precision': 0.2773109243697479,
 'eval_recall': 0.21019108280254778,
 'eval_f1': 0.2391304347826087,
 'eval_accuracy': 0.7450331125827815,
 'eval_runtime': 0.0918,
 'eval_samples_per_second': 326.934,
 'eval_steps_per_second': 10.898,
 'epoch': 20.0}
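If you created a separate validation split earlier, you can point the evaluation at it explicitly (a sketch assuming that split exists):

# Evaluate on a specific split by passing it explicitly
trainer.evaluate(eval_dataset=tokenized_dataset["validation"])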

🔮 Try out predictions

When you have created a model you are happy with, test it with your own text:

# Replace this with the directory where it was saved
model_checkpoint = "your-path"
token_classifier = pipeline("token-classification", model=model_checkpoint, aggregation_strategy="simple")
token_classifier("I heard Madrid is wonderful in spring.")

📝✔️ Summary#

In this tutorial, we learned how to upload our training dataset to Argilla in order to visualize the data it contains and the NER tags it uses, and how to fine-tune a BERT model for NER with transformers. This is very useful for learning the basics of pretrained BERT models; from there, you can develop your skills further and try different models that may yield better results.

💪 Cheers!