Browsing by Subject "Wikipedia"

Now showing items 1-4 of 4
  • Ruokolainen, Teemu; Kauppinen, Pekka; Silfverberg, Miikka; Lindén, Krister (2020)
    We present a corpus of Finnish news articles with a manually prepared named entity annotation. The corpus consists of 953 articles (193,742 word tokens) with six named entity classes (organization, location, person, product, event, and date). The articles are extracted from the archives of Digitoday, a Finnish online technology news source. The corpus is available for research purposes. We present baseline experiments on the corpus using one rule-based system and two deep learning systems, evaluated on two test sets, one in-domain and one out-of-domain.
  • Mittermeier, John C.; Roll, Uri; Matthews, Thomas J.; Correia, Ricardo A.; Grenyer, Richard (2021)
    Large body size, the defining characteristic of "charismatic megafauna," is often viewed as the most significant correlate of higher public interest in species. However, common, local species (many of which are not large) can also generate public interest. We explored the relative importance of body size versus local occurrence in patterns of online interest in birds using a large sample of digital human-wildlife interactions (367 million Wikipedia pageviews) that included more than 10,000 bird species and a range of cultural and geographic contexts (represented by 25 Wikipedia language editions). We compared interest in Wikipedia, as measured by pageviews, with a bird's body size and its regional observation frequency (using data from eBird). We found that local species (i.e., those that occur in the wild in the country responsible for the majority of a Wikipedia language edition's pageviews) attract more pageviews than global species. Both body size and observation frequency had a positive correlation with Wikipedia pageviews across languages, but eBird observation frequency explained more of the variance in pageviews on average. In a model that included both observation frequency and body size, observation frequency was a significantly better predictor of pageviews than body size in 24 of 25 languages. Our results demonstrate that the opportunity to encounter birds in the wild is a significant correlate of increased online interest in birds across multiple linguistic and geographic contexts. This relationship provides insight into why some species attract greater interest than others and emphasizes the overlooked potential of common species in conservation marketing.
  • Aaltonen, Aleksi (2011)
    In this paper we propose a theoretical framework to understand the governance of internet-mediated social production. Focusing on one of the most popular websites and reference tools, Wikipedia, we undertake an exploratory theoretical analysis to clarify the structure and mechanisms driving the endogenous change of a large-scale social production system. We argue that the popular transaction cost approach underpinning many of the analyses is an insufficient framework for unpacking the evolutionary character of governance. The evolution of Wikipedia and its shifting modes of governance can be better framed as a process of building a collective capability, namely the capability of editing and managing a new kind of encyclopedia. We understand Wikipedia's evolution as a learning phenomenon that over time gives rise to governance mechanisms and structures as endogenous responses to the problems and conditions that the ongoing development of Wikipedia itself has produced over the years. Finally, we put forward five empirical hypotheses to test the theoretical framework.
  • Mittermeier, John; Correia, Ricardo A.; Grenyer, Richard; Toivonen, Tuuli; Roll, Uri (2021)
    The recent growth of online big data offers opportunities for rapid and inexpensive measurement of public interest. Conservation culturomics is an emerging research area that uses online data to study human-nature relationships for conservation. Methods for conservation culturomics, though promising, are still being developed and refined. We considered the potential of Wikipedia, the online encyclopedia, as a resource for conservation culturomics and outlined methods for using Wikipedia data in conservation. Wikipedia's large size, widespread use, underlying data structure, and open access to both its content and usage analytics make it well suited to conservation culturomics research. Limitations of Wikipedia data include the lack of location information associated with some metadata and limited information on the motivations of many users. Seven methodological steps to consider when using Wikipedia data in conservation include metadata selection, temporality, taxonomy, language representation, Wikipedia geography, physical and biological geography, and comparative metrics. Each of these methodological decisions can affect measures of online interest. As a case study, we explored these themes by analyzing 757 million Wikipedia page views associated with the Wikipedia pages for 10,099 species of birds across 251 Wikipedia language editions. We found that Wikipedia data have the potential to generate insight for conservation and are particularly useful for quantifying patterns of public interest at large scales.