Search results for: mining-latent-entity-structures

Mining Latent Entity Structures

Author : Chi Wang
File Size : 37.23 MB
Format : PDF, ePub, Docs
Download : 976
Read : 1139
The "big data" era is characterized by an explosion of information in the form of digital data collections, ranging from scientific knowledge, to social media, news, and everyone's daily life. Examples of such collections include scientific publications, enterprise logs, news articles, social media, and general web pages. Valuable knowledge about multi-typed entities is often hidden in the unstructured or loosely structured, interconnected data. Mining latent structures around entities uncovers hidden knowledge such as implicit topics, phrases, entity roles and relationships. In this monograph, we investigate the principles and methodologies of mining latent entity structures from massive unstructured and interconnected data. We propose a text-rich information network model for modeling data in many different domains. This leads to a series of new principles and powerful methodologies for mining latent structures, including (1) latent topical hierarchy, (2) quality topical phrases, (3) entity roles in hierarchical topical communities, and (4) entity relations. This book also introduces applications enabled by the mined structures and points out some promising research directions.

Mining Structures of Factual Knowledge from Text

Author : Xiang Ren
File Size : 37.89 MB
Format : PDF, ePub, Docs
Download : 235
Read : 1273
Real-world data, though massive, is largely unstructured, in the form of natural-language text. It is challenging but highly desirable to mine structures from massive text data without extensive human annotation and labeling. In this book, we investigate the principles and methodologies of mining structures of factual knowledge (e.g., entities and their relationships) from massive, unstructured text corpora. Departing from many existing structure extraction methods that rely heavily on human-annotated data for model training, our effort-light approach leverages human-curated facts stored in external knowledge bases as distant supervision and exploits rich data redundancy in large text corpora for context understanding. This effort-light mining approach leads to a series of new principles and powerful methodologies for structuring text corpora, including (1) entity recognition, typing and synonym discovery, (2) entity relation extraction, and (3) open-domain attribute-value mining and information extraction. This book introduces this new research frontier and points out some promising research directions.

Individual and Collective Graph Mining

Author : Danai Koutra
File Size : 72.91 MB
Format : PDF
Download : 154
Read : 1318
Graphs naturally represent information ranging from links between web pages, to communication in email networks, to connections between neurons in our brains. These graphs often span billions of nodes and interactions between them. Within this deluge of interconnected data, how can we find the most important structures and summarize them? How can we efficiently visualize them? How can we detect anomalies that indicate critical events, such as an attack on a computer system, disease formation in the human brain, or the fall of a company? This book presents scalable, principled discovery algorithms that combine globality with locality to make sense of one or more graphs. In addition to fast algorithmic methodologies, we also contribute graph-theoretical ideas and models, and real-world applications in two main areas:
• Individual Graph Mining: We show how to interpretably summarize a single graph by identifying its important graph structures. We complement summarization with inference, which leverages information about a few entities (obtained via summarization or other methods) and the network structure to efficiently and effectively learn information about the unknown entities.
• Collective Graph Mining: We extend the idea of individual-graph summarization to time-evolving graphs, and show how to scalably discover temporal patterns. Apart from summarization, we claim that graph similarity is often the underlying problem in a host of applications where multiple graphs occur (e.g., temporal anomaly detection, discovery of behavioral patterns), and we present principled, scalable algorithms for aligning networks and measuring their similarity.
The methods that we present in this book leverage techniques from diverse areas, such as matrix algebra, graph theory, optimization, information theory, machine learning, finance, and social science, to solve real-world problems. We present applications of our exploration algorithms to massive datasets, including a Web graph of 6.6 billion edges, a Twitter graph of 1.8 billion edges, and brain graphs with up to 90 million edges, as well as collaboration networks, peer-to-peer networks, and browser logs, all spanning millions of users and interactions.

Phrase Mining from Massive Text and Its Applications

Author : Jialu Liu
File Size : 26.95 MB
Format : PDF, ePub
Download : 736
Read : 1282
A lot of digital ink has been spilled on "big data" over the past few years. Most of this surge owes its origin to the various types of unstructured data in the wild, among which the proliferation of text-heavy data is particularly overwhelming, attributed to the daily use of web documents, business reviews, news, social posts, etc., by so many people worldwide. A core challenge presents itself: How can one efficiently and effectively turn massive, unstructured text into structured representation so as to further lay the foundation for many other downstream text mining applications? In this book, we investigate one promising paradigm for representing unstructured text, that is, through automatically identifying high-quality phrases from innumerable documents. In contrast to a list of frequent n-grams without proper filtering, users are often more interested in results based on variable-length phrases with certain semantics such as scientific concepts, organizations, slogans, and so on. We propose new principles and powerful methodologies to achieve this goal, from the scenario where a user can provide meaningful guidance to a fully automated setting through distant learning. This book also introduces applications enabled by the mined phrases and points out some promising research directions.

Mining Human Mobility in Location-Based Social Networks

Author : Huiji Gao
File Size : 40.81 MB
Format : PDF, Mobi
Download : 994
Read : 232
In recent years, there has been rapid growth in location-based social networking services, such as Foursquare and Facebook Places, which have attracted an increasing number of users and greatly enriched their urban experience. Typical location-based social networking sites allow a user to "check in" at a real-world POI (point of interest, e.g., a hotel, restaurant, or theater), leave tips about the POI, and share the check-in with their online friends. The check-in action bridges the gap between the real world and online social networks, resulting in a new type of social network, namely location-based social networks (LBSNs). Compared to traditional GPS data, location-based social network data has unique properties, with abundant heterogeneous information to reveal human mobility, i.e., "when and where a user (who) has been to for what," corresponding to an unprecedented opportunity to better understand human mobility from spatial, temporal, social, and content aspects. The mining and understanding of human mobility can further lead to effective approaches to improve current location-based services, from mobile marketing to recommender systems, providing users with a more convenient life experience than before. This book takes a data mining perspective to offer an overview of studying human mobility in location-based social networks and illuminate a wide range of related computational tasks. It introduces basic concepts, elaborates associated challenges, reviews state-of-the-art algorithms with illustrative examples and real-world LBSN datasets, and discusses effective evaluation methods in mining human mobility.
In particular, we illustrate unique characteristics and research opportunities of LBSN data and present representative tasks of mining human mobility on location-based social networks, including capturing user mobility patterns to understand when and where a user commonly goes (location prediction), and exploiting user preferences and location profiles to investigate where and when a user wants to explore (location recommendation), along with studying a user's check-in activity in terms of why a user goes to a certain location.

Data Mining for Business Applications

Author : Longbing Cao
File Size : 86.69 MB
Format : PDF, ePub
Download : 210
Read : 242
Data Mining for Business Applications presents the state-of-the-art research and development outcomes on methodologies, techniques, approaches and successful applications in the area. The contributions mark a paradigm shift from “data-centered pattern mining” to “domain driven actionable knowledge discovery” for next-generation KDD research and applications. The contents identify how KDD techniques can better contribute to critical domain problems in theory and practice, and strengthen business intelligence in complex enterprise applications. The volume also explores challenges and directions for future research and development in the dialogue between academia and business.

Mining Text Data

Author : Charu C. Aggarwal
File Size : 67.5 MB
Format : PDF
Download : 349
Read : 339
Text mining applications have experienced tremendous advances because of web 2.0 and social networking applications. Recent advances in hardware and software technology have led to a number of unique scenarios where text mining algorithms are learned. Mining Text Data introduces an important niche in the text analytics field, and is an edited volume contributed by leading international researchers and practitioners focused on social networks & data mining. This book covers a wide swath of topics across social networks & data mining. Each chapter contains a comprehensive survey including the key research content on the topic, and the future directions of research in the field. There is a special focus on Text Embedded with Heterogeneous and Multimedia Data, which makes the mining process much more challenging. A number of methods have been designed such as transfer learning and cross-lingual mining for such cases. Mining Text Data simplifies the content, so that advanced-level students, practitioners and researchers in computer science can benefit from this book. Academic and corporate libraries, as well as ACM, IEEE, and Management Science focused on information security, electronic commerce, databases, data mining, machine learning, and statistics are the primary buyers for this reference book.

Handbook of Research on Text and Web Mining Technologies

Author : Song, Min
File Size : 51.63 MB
Format : PDF, Kindle
Download : 128
Read : 417
Examines recent advances and surveys of applications in text and web mining which should be of interest to researchers and end-users alike.

Natural Language Processing and Text Mining

Author : Anne Kao
File Size : 81.54 MB
Format : PDF, ePub
Download : 395
Read : 1295
Natural Language Processing and Text Mining not only discusses applications of Natural Language Processing techniques to certain Text Mining tasks, but also the converse, the use of Text Mining to assist NLP. It assembles diverse views from internationally recognized researchers and emphasizes caveats in the attempt to apply Natural Language Processing to text mining. This state-of-the-art survey is a must-have for advanced students, professionals, and researchers.

Social and Political Implications of Data Mining: Knowledge Management in E-Government

Author : Rahman, Hakikur
File Size : 82.5 MB
Format : PDF
Download : 379
Read : 631
"This book focuses on the data mining and knowledge management implications that lie within online government"--Provided by publisher.

Proceedings of the Seventh SIAM International Conference on Data Mining

Author : Chid Apte
File Size : 33.36 MB
Format : PDF, Mobi
Download : 403
Read : 1225
The Seventh SIAM International Conference on Data Mining (SDM 2007) continues a series of conferences whose focus is the theory and application of data mining to complex datasets in science, engineering, biomedicine, and the social sciences. These datasets challenge our abilities to analyze them because they are large and often noisy. Sophisticated, high-performance, and principled analysis techniques and algorithms, based on sound statistical foundations, are required. Visualization is often critically important; tuning for performance is a significant challenge; and the appropriate levels of abstraction to allow end-users to exploit sophisticated techniques and understand clearly both the constraints and interpretation of results are still something of an open question.

Text Mining Application Programming

Author : Manu Konchady
File Size : 23.49 MB
Format : PDF, Mobi
Download : 246
Read : 1006
Text Mining Application Programming teaches software developers how to mine the vast amounts of information available on the Web, internal networks, and desktop files and turn it into usable data. The book helps developers understand the problems associated with managing unstructured text, and explains how to build your own mining tools using standard statistical methods from information theory, artificial intelligence, and operations research. Each of the topics covered is thoroughly explained, and then a practical implementation is provided. The book begins with a brief overview of text data, where it can be found, and the typical search engines and tools used to search and gather this text. It details how to build tools for extracting and using the text, and covers the mathematics behind many of the algorithms used in building these tools. From there you'll learn how to build tokens from text, construct indexes, and detect patterns in text. You'll also find methods to extract the names of people, places, and organizations from an email, a news article, or a Web page. The next portion of the book teaches you how to find information on the Web, the structure of the Web, and how to build spiders to crawl the Web. Text categorization is also described in the context of managing email. The final part of the book covers information monitoring, summarization, and a simple Question & Answer (Q&A) system. The code used in the book is written in Perl, but knowledge of Perl is not necessary to run the software. Developers with an intermediate level of experience with Perl can customize the software. Although the book is about programming, methods are explained with English-like pseudocode and the source code is provided on the CD-ROM. After reading this book, you'll be ready to tap into the bevy of information available online in ways you never thought possible.

Data Mining in Biomedicine

Author : Panos M. Pardalos
File Size : 25.8 MB
Format : PDF, ePub, Docs
Download : 511
Read : 1306
This volume presents an extensive collection of contributions covering aspects of the exciting and important research field of data mining techniques in biomedicine. Coverage includes new approaches for the analysis of biomedical data; applications of data mining techniques to real-life problems in medical practice; comprehensive reviews of recent trends in the field. The book addresses incorporation of data mining in fundamental areas of biomedical research: genomics, proteomics, protein characterization, and neuroscience.

Mobility Data Mining and Privacy

Author : Fosca Giannotti
File Size : 36.67 MB
Format : PDF, ePub, Mobi
Download : 691
Read : 1272
Mobile communications and ubiquitous computing generate large volumes of data. Mining this data can produce useful knowledge, yet individual privacy is at risk. This book investigates the various scientific and technological issues of mobility data, open problems, and a roadmap. The editors manage a research project called GeoPKDD, Geographic Privacy-Aware Knowledge Discovery and Delivery, and this book relates their findings in 13 chapters covering all related subjects.

Mining the Biomedical Literature

Author : Hagit Shatkay
File Size : 87.95 MB
Format : PDF, ePub, Mobi
Download : 502
Read : 230
The authors offer an accessible introduction to key ideas in biomedical text mining. The chapters cover such topics as the sources of biomedical text; text-analysis methods in natural language processing; the tasks of information extraction, information retrieval, and text categorization; and methods for empirically assessing text-mining systems.

Advances in Knowledge Discovery and Data Mining

Author : Takashi Washio
File Size : 85.13 MB
Format : PDF, ePub
Download : 143
Read : 918
This book constitutes the refereed proceedings of the 12th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2008, held in Osaka, Japan, in May 2008. The 37 revised long papers, 40 revised full papers, and 36 revised short papers presented together with 1 keynote talk and 4 invited lectures were carefully reviewed and selected from 312 submissions. The papers present new ideas, original research results, and practical development experiences from all KDD-related areas including data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition, automatic scientific discovery, data visualization, causal induction, and knowledge-based systems.

Emerging Technologies of Text Mining: Techniques and Applications

Author : do Prado, Hercules Antonio
File Size : 41.72 MB
Format : PDF, Kindle
Download : 822
Read : 920
"This book provides the most recent technical information related to the computational models of the text mining process, discussing techniques within the realms of classification, association analysis, information extraction, and clustering. Offering an innovative approach to the utilization of textual information mining to maximize competitive advantage, it will provide libraries with the defining reference on this topic"--Provided by publisher.


Author : Perry Fairfax Nursey
File Size : 87.87 MB
Format : PDF, Kindle
Download : 299
Read : 1315

Encyclopedia of Data Warehousing and Mining, Second Edition

Author : Wang, John
File Size : 61.83 MB
Format : PDF, ePub, Docs
Download : 411
Read : 436
There are more than one billion documents on the Web, with the count continually rising at a pace of over one million new documents per day. As information increases, organizational interest in data warehousing and mining research and practice remains high. The Encyclopedia of Data Warehousing and Mining, Second Edition, offers thorough exposure to the issues of importance in the rapidly changing field of data warehousing and mining. This essential reference source informs decision makers, problem solvers, and data mining specialists in business, academia, government, and other settings with over 300 entries on theories, methodologies, functionalities, and applications.

Advances in Knowledge Discovery and Data Mining

Author : Jinho Kim
File Size : 66.3 MB
Format : PDF, Kindle
Download : 498
Read : 301
This two-volume set, LNAI 10234 and 10235, constitutes the thoroughly refereed proceedings of the 21st Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2017, held in Jeju, South Korea, in May 2017. The 129 full papers were carefully reviewed and selected from 458 submissions. They are organized in topical sections named: classification and deep learning; social network and graph mining; privacy-preserving mining and security/risk applications; spatio-temporal and sequential data mining; clustering and anomaly detection; recommender system; feature selection; text and opinion mining; clustering and matrix factorization; dynamic, stream data mining; novel models and algorithms; behavioral data mining; graph clustering and community detection; dimensionality reduction.