Month: September 2022

2022 September status on Danish lexemes in Wikidata


Some statistics:
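Counts of this kind can be obtained from the Wikidata Query Service. The following is a minimal sketch, not necessarily how the numbers in this post were produced. It assumes the public endpoint at https://query.wikidata.org/sparql and the item Q9035 for the Danish language; the function names and the User-Agent string are made up for the illustration.

```python
import json
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
DANISH_QID = "Q9035"  # Wikidata item for the Danish language


def build_count_query(language_qid: str) -> str:
    """Build a SPARQL query counting the lexemes in one language."""
    return (
        "SELECT (COUNT(?lexeme) AS ?count) WHERE { "
        f"?lexeme dct:language wd:{language_qid} . "
        "}"
    )


def parse_count(response: dict) -> int:
    """Extract the count from a SPARQL JSON result."""
    return int(response["results"]["bindings"][0]["count"]["value"])


def fetch_count(language_qid: str = DANISH_QID) -> int:
    """Send the query to the endpoint (hypothetical helper, needs network)."""
    params = urllib.parse.urlencode(
        {"query": build_count_query(language_qid), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={"User-Agent": "lexeme-count-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as reply:
        return parse_count(json.load(reply))
```

Calling `fetch_count()` needs network access and returns whatever the count is at query time, so no fixed number is shown here.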

Interesting individual lexemes:

  • rød: a word with many compounds specified.
  • værdipapirfinansieringstransaktionseksponering: the longest Danish lexeme, attested in Den Europæiske Centralbanks udtalelse af 8. november 2017 om ændringer til EU’s ramme for kapitalkrav til kreditinstitutter og investeringsselskaber (the European Central Bank's opinion of 8 November 2017 on amendments to the EU framework for capital requirements for credit institutions and investment firms).
  • koronavirus: a lexeme with many alternative forms, due to two variations: korona/corona and virusser/virus/vira.
  • led (representation): a word with many homographs.
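The two variations noted for koronavirus multiply: two spellings of the first part times three endings gives six candidate spellings. A small sketch of that arithmetic follows. It assumes that virusser/virus/vira are the alternative plurals and that every combination occurs; the forms actually recorded on the Wikidata lexeme may differ.

```python
from itertools import product

# Variation axes taken from the koronavirus item above.
prefixes = ["korona", "corona"]
plurals = ["virusser", "virus", "vira"]  # assumed to be alternative plurals

# Combine every prefix with every plural ending.
plural_forms = [prefix + plural for prefix, plural in product(prefixes, plurals)]

print(len(plural_forms))  # 6
for form in plural_forms:
    print(form)
```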

Images from Wikimedia Commons. Licenses available from links at https://ordia.toolforge.org/L2310.

Types:

Descriptions, use cases for instance:

Alignment with COR

I am working on examining the alignment between the Danish lexemes in Wikidata and COR. Issues noted so far:

  • Missing genitive on nouns and numerals in Wikidata.
  • øl (øllen/øllet), plan (planen/planet): these are separate entries in COR but a single lexeme in Den Danske Ordbog and, currently, in Wikidata.
  • Adverbs from adjectives. They are under adjectives in COR.
  • Adjectives from verbs (past participles, perfektum participium), e.g., ubekymret (in COR as an adjective) and bekymret (not in COR as an adjective).
  • Varying inflection schemes for adjectives, e.g., -sk adjectives do not inflect for grammatical number.

See also the earlier posts Linking from Danish Wikidata lexemes to COR and Part-of-speech tags in Det Centrale Ordregister.

Ordia

Ordia is a web application for Wikidata lexemes.

Tools in the application:

The tool apparently gets around 200 pageviews per day.