Bootstrapping Yahoo! Finance by Wikipedia for competitor mining

Tong Ruan*, Lijuan Xue, Haofen Wang, Jeff Z. Pan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Published conference contribution

3 Citations (Scopus)


Competitive intelligence, one of the key factors of enterprise risk management and decision support, depends on knowledge bases that contain a large amount of competitive information. A variety of finance websites have collected competitive information manually and can be used as such knowledge bases; Yahoo! Finance is one of the largest and most successful among them. However, these websites suffer from incompleteness, missing competitive domains, and delayed updates. Wikipedia, which was built with collective wisdom and contains plenty of useful information in various forms, can address these problems effectively and thus help build a more comprehensive knowledge base. In this paper, we propose a novel semi-supervised approach, based on a multi-strategy learning algorithm, to identify competitor information and competitive domains from Wikipedia. More precisely, we use seeds of competition between companies and competition between products to distantly supervise the learning process, which finds text patterns in free text. Considering that competitive information can also be inferred from events, we design a learning-based method to identify event description sentences. The whole process is performed iteratively. The experimental results show the effectiveness of our approach. Moreover, the results extracted from Wikipedia add 14,000 competitor pairs and 8,000 competitive domains between rival companies to those available on Yahoo! Finance.
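The seed-driven, iterative pattern learning mentioned in the abstract can be illustrated with a generic bootstrapping loop. The Python sketch below is only an assumption-laden toy, not the authors' multi-strategy algorithm: the function names, the "text between two company mentions" pattern definition, the `min_support` threshold, and the sample sentences are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

def extract_patterns(sentences, seed_pairs):
    """Distant supervision step: treat the text between two known
    competitors as a candidate competition pattern and count it."""
    patterns = Counter()
    for sentence in sentences:
        for a, b in seed_pairs:
            for x, y in ((a, b), (b, a)):  # try both mention orders
                regex = re.escape(x) + r"\s+(.{1,40}?)\s+" + re.escape(y)
                match = re.search(regex, sentence)
                if match:
                    patterns[match.group(1).strip().lower()] += 1
    return patterns

def apply_patterns(sentences, patterns, companies):
    """Find company pairs joined by any learned pattern."""
    found = set()
    for sentence in sentences:
        lowered = sentence.lower()
        for pattern in patterns:
            for a in companies:
                for b in companies:
                    if a != b and f"{a} {pattern} {b}".lower() in lowered:
                        found.add(frozenset((a, b)))
    return found

def bootstrap(sentences, seed_pairs, companies, min_support=1, rounds=3):
    """Alternate pattern learning and pair extraction until no new
    pairs appear or the round limit is reached (hypothetical loop)."""
    known = {frozenset(pair) for pair in seed_pairs}
    for _ in range(rounds):
        counts = extract_patterns(sentences, [tuple(p) for p in known])
        patterns = [p for p, c in counts.items() if c >= min_support]
        new_pairs = apply_patterns(sentences, patterns, companies) - known
        if not new_pairs:
            break
        known |= new_pairs
    return known

if __name__ == "__main__":
    # Made-up sentences standing in for Wikipedia free text.
    corpus = [
        "Coca-Cola competes with Pepsi in the soft drink market.",
        "Nike competes with Adidas in sportswear.",
        "Adidas is a rival of Nike.",
        "Puma is a rival of Adidas.",
    ]
    names = ["Coca-Cola", "Pepsi", "Nike", "Adidas", "Puma"]
    for pair in bootstrap(corpus, [("Coca-Cola", "Pepsi")], names):
        print(sorted(pair))
```

On this toy corpus the single seed yields the pattern "competes with", which surfaces the Nike/Adidas pair; that pair then yields "is a rival of", which in turn surfaces Puma/Adidas. A realistic system would need entity linking, pattern scoring, and noise filtering, none of which are attempted here.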

Original language: English
Title of host publication: Semantic Technology
Subtitle of host publication: 5th Joint International Conference, JIST 2015, Yichang, China, November 11-13, 2015, Revised Selected Papers
Editors: Guilin Qi, Kouji Kozaki, Jeff Z. Pan, Siwei Yu
Number of pages: 19
ISBN (Print): 9783319316758
Publication status: Published - 2016
Event: 5th Joint International Conference on Semantic Technology, JIST 2015 - Yichang, China
Duration: 11 Nov 2015 to 13 Nov 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 5th Joint International Conference on Semantic Technology, JIST 2015


  • Competitor mining
  • Distant supervision
  • Multi-strategy learning
  • Relation reasoning

