Financial Industry Business Ontology (FIBO)
FIBO is an industry-standard initiative to define financial industry terms, definitions and synonyms using the principles of the Semantic Web. It is a joint effort of the OMG and the Enterprise Data Management Council. The project was initiated by direct requests for better control of financial data, reporting, notification of important events and so on. The objective is to integrate existing standards into this solution and also to avoid the famous N+1 problem.
Financial Industry Business Ontology (FIBO) is a model that can be used effectively in the following areas:
• Regulatory authority
• Data standardization
• Risk management
• Analytics requirements
Facebook announced - Graph search
Facebook has announced a major addition to its social network: a smart search engine called Graph Search.
The new tool allows users to search the content of their friends in a "natural" way.
Search terms can include phrases such as "Friends who love Star Wars and Harry Potter."
Founder and CEO Mark Zuckerberg has insisted that this is not a web search and therefore is not a direct challenge to Google.
However, it integrates Microsoft's Bing search engine for situations where Graph Search cannot find the answer.
Mr. Zuckerberg said he "did not expect" people to start flocking to Facebook to perform a web search.
"It is not the intention," he said. "But in case you cannot find what you are looking for, it's nice to have this tool."
Posted by: Dejan Petrovic.
An algorithm to trace the source of crimes, epidemics or rumors
Tracing back to a source is often a difficult job, which a new discovery could make easier. A Portuguese researcher at the Ecole Polytechnique Federale de Lausanne (EPFL) has developed a mathematical system to identify the source of information circulating in a network, of an epidemic, or of a terrorist attack, EPFL announced on Friday, August 10. The research is published in the journal Physical Review Letters.
The researcher Pedro Pinto, who works in the Audiovisual Communications Laboratory at EPFL, has developed a system "that could prove a valuable ally" for those who conduct criminal investigations or seek the origin of information on the Web. "With our method, we can trace the source of all types of information flowing through a network by listening to only a small number of members," said Pedro Pinto.
TRACING BACK TO THE SOURCE OF INFORMATION OR DISEASE
For example, he says he can find the author of a rumor circulating among five hundred members of the same network, such as Facebook, by observing the messages of only fifteen to twenty contacts. "Our algorithm is able to retrace, in reverse, the path taken by the information, and trace it back to the source," he said. The researcher also tested the system by tracing the origin of an infectious disease in South Africa. "By modeling the water, river and human-transport networks, we were able to find the place where the first cases were reported," he said.
The researcher also tested his system on the telephone calls related to the preparation of the attacks of September 11, 2001. "By rebuilding the network of the terrorists solely on the basis of press reports, our system gave us three possible suspects, one of whom was the proven leader of these attacks according to the official investigation." The details of the algorithm were published on Friday.
From the publication Locating the Source of Diffusion in Large-Scale Networks:
How can we localize the source of diffusion in a complex network? Due to the tremendous size of many real networks—such
as the Internet or the human social graph—it is usually infeasible to observe the state of all nodes in a network. We show that
it is fundamentally possible to estimate the location of the source from measurements collected by sparsely-placed observers.
We present a strategy that is optimal for arbitrary trees, achieving maximum probability of correct localization. We describe
efficient implementations with complexity O(N^α), where α = 1 for arbitrary trees, and α = 3 for arbitrary graphs. In the
context of several case studies, we determine how localization accuracy is affected by various system parameters, including
the structure of the network, the density of observers, and the number of observed cascades.
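The paper's estimator is a Gaussian one that also handles unknown propagation speeds; a much-simplified sketch still conveys the idea. In the toy version below (the node names, unit edge delays and variance scoring are our own illustrative assumptions, not the paper's exact setup), each observer reports when the cascade reached it, and the estimated source is the node whose shortest-path distances best explain those arrival times:

```python
from collections import deque
from statistics import pvariance

def bfs_distances(graph, start):
    """Hop distances from `start` to every node (unit edge delays)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def estimate_source(graph, arrival_times):
    """Pick the node whose distances to the observers best fit a
    constant-speed spread: the offsets (t_obs - distance) should be
    nearly equal, so we minimise their variance."""
    best_node, best_score = None, float("inf")
    for candidate in graph:
        dist = bfs_distances(graph, candidate)
        offsets = [t - dist[obs] for obs, t in arrival_times.items()]
        score = pvariance(offsets)
        if score < best_score:
            best_node, best_score = candidate, score
    return best_node

# Toy tree: a rumour starts at "a" at time 0 and spreads one hop per
# step; we only observe when it reached the leaves "d" and "e".
tree = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "e"],
        "d": ["b"], "e": ["c"]}
print(estimate_source(tree, {"d": 2, "e": 2}))
```

With only two of the five nodes observed, the estimator still recovers the true source, which is the effect the paper quantifies at scale.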
Pedro Pinto page »
Viralheat Sentiment API Is Getting 300M Calls Per Week
Viralheat is best known for its semantic tools, analytics and social-media publishing tools, but it also offers a free API. What is particularly interesting about the API is that it analyzes public sentiment, and according to CEO Raj Kadam the Viralheat API has come into extensive use by many organizations, companies and individuals. He said that Viralheat originally wanted to use other companies' sentiment-analysis APIs, but was not satisfied with what it saw: all the functionality was based on keyword detection, instead of using natural language processing to get the true meaning of a message, tweet or comment. So the company decided to build its own tools for analyzing public sentiment and to make them available to other companies and organizations through the API. As an example of what the API can handle, "sick" means different things in different contexts, and Viralheat can understand that someone talking about "sick jeans" means something different from someone saying they are "sick in the hospital."
Viralheat now says the API is used by 1,500 organizations and developers in industries ranging from finance to academia, and that it handles 300 million calls per week. (Each call is a piece of text, such as a tweet, that is being analyzed.) All this usage makes Viralheat's sentiment analysis smarter, because the company can correct the analysis, for example when the API classifies a tweet as neutral but it turns out to be positive.
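Viralheat's models are proprietary, so the following is only a toy illustration of the difference described above: a keyword rule always reads "sick" as negative, while even a minimal context rule (the word lists here are invented) can tell "sick jeans" from "sick in the hospital":

```python
def keyword_sentiment(text):
    """Naive keyword matching: 'sick' is always negative."""
    return "negative" if "sick" in text.lower() else "neutral"

def context_sentiment(text):
    """Toy context rule: 'sick' followed by a product noun is slang
    for 'great'; otherwise it keeps its literal, negative meaning.
    (Illustrative only -- real systems use full NLP, not word lists.)"""
    words = text.lower().replace(".", "").split()
    if "sick" in words:
        i = words.index("sick")
        after = words[i + 1] if i + 1 < len(words) else ""
        if after in {"jeans", "shoes", "beat", "ride"}:  # slang contexts
            return "positive"
        return "negative"
    return "neutral"

print(keyword_sentiment("Those are sick jeans"))        # wrong: negative
print(context_sentiment("Those are sick jeans"))        # positive
print(context_sentiment("He is sick in the hospital"))  # negative
```

Even this crude rule reverses the verdict on the slang case, which is the gap between keyword detection and language understanding that the article describes.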
techcrunch.com » | Viralheat »
Social media could help detect pandemics
A growing segment of the medical community believes that this is a realistic possibility and is increasingly looking at ways to harness the power of blogs, news outlets and social-networking websites to detect disease patterns around the world. Dozens of researchers gathered Monday at a pandemic conference in Toronto to hear about the progress one expert has made toward achieving those goals. John Brownstein, an epidemiologist who works as a researcher at Children's Hospital Boston, told researchers that instead of relying solely on government-based disease-surveillance systems, they should recognize the power of clues coming from individuals on the ground.
Dr. Brownstein and his colleagues have created HealthMap, an ambitious website and mobile application that constantly trolls the Internet for emerging outbreaks of the flu or a new respiratory illness. HealthMap uses news sites, eye-witness reports, government disease-tracking systems, wildlife disease-surveillance websites and other sources to identify new patterns in disease and where they are occurring. The scope is impressive; HealthMap automatically scrolls through tens of thousands of websites an hour, Dr. Brownstein said. ‘We’re constantly mining the Web,’ he said. Researchers recently used HealthMap to illustrate on a world map the location of new cases of E. coli infection as they were identified, following a massive illness outbreak that was eventually linked to German sprouts.
HealthMap » | The globe and mail »
Open PHACTS: drug discovery
A new consortium of European organizations is uniting to support the next generation of drug-discovery research, providing a single view across different data sources and bringing the Semantic Web into the study of drugs.
Open PHACTS (http://www.openphacts.org) is a consortium, funded by the Innovative Medicines Initiative, that aims to reduce barriers to drug discovery by applying semantic technology to the available resources and data, creating an open pharmacological space.
Currently, pharmaceutical companies invest considerable, duplicated effort in coordinating and integrating internal information and public data sources. This process is largely incompatible with large-scale computational access, and the vast majority of drug-discovery resources cannot work together.
Open PHACTS (Open Pharmacological Concepts Triple Store) will provide a single view across the available data resources and will be free for users.
Scientific texts are difficult for computers to analyze. The system will store factual assertions as semantic triples, making it possible for the first time to query text and databases together and to use the results to identify new drug targets and pharmacological interactions. While the semantic approach has delivered in small, targeted cases so far, the promise of multi-scale data integration remains largely unfulfilled.
Open PHACTS is a large project that includes many of the top experts in the Semantic Web, who are committed to fulfilling this promise.
Open PHACTS »
Posted by: Dejan Petrovic
National Security and Terrorism
Understanding language, whether it is written, spoken or implied by action, is an essential capability in many analytical systems. Central to understanding language in humans and machines are the areas of computational linguistics and formal semantic representation.
In addition to identifying linguistic patterns and concepts, it is often necessary to have formal representations of language data for complex reasoning tasks. Concepts like "semantic computing" and "semantic search" refer to computational techniques that use knowledge representation and deep linkage into the referents of information tokens in language (e.g., dictionaries, thesauri and ontologies) and in data resources (e.g., libraries, databases and web-based repositories). Perhaps the best-known sense is in the "semantic web", as described by Berners-Lee, et al. (2001).
Sweto ontology »
Twitter and Stock Market predictions
Traders are constantly looking for ways to find out what the market wants and where it is headed, down to the small differences that can matter. To this end, traders follow a number of seemingly unrelated factors, such as weather reports and the like, to increase their chances and the accuracy of their market forecasts.
It is probably not surprising that stock-market traders today try to follow conversations on large social networks such as Twitter in order to learn which stocks are more or less popular.
Does this really work in practice? German PhD student Timm Sprenger says his analysis shows that it does. To put his findings to the test he set up a website, TweetTrader.net, which monitors stocks and Twitter messages.
The idea behind Twitter-tracking is that if a million people speak (write) about things they like or dislike, whether movies, books, food or celebrities, then a certain percentage of these messages relate to the stock market.
This resembles the "sentiment indicators" compiled by the many research firms that work for brokerages, which combine economic analyses, theories and comments provided by economists, investors and shareholders; monitoring and analyzing Twitter messages in real time yields a similar type of information.
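The article does not give Sprenger's exact formula, but sentiment indicators of this kind are commonly computed as a log-ratio of bullish to bearish message counts (the Antweiler-Frank style measure used in this line of research); a sketch under that assumption, with hypothetical daily tweet counts:

```python
import math

def bullishness(buy_msgs, sell_msgs):
    """Log-ratio bullishness index over tweet counts classified as
    buy/sell signals. Values > 0 lean bullish, < 0 bearish; the +1
    terms keep the index defined when one count is zero."""
    return math.log((1 + buy_msgs) / (1 + sell_msgs))

# Hypothetical counts for one ticker on two different days
print(bullishness(120, 40))  # mostly buy-tagged tweets: bullish day
print(bullishness(10, 10))   # balanced sentiment: index is 0.0
```

A trading strategy would then compare this daily index against the next day's price move, which is essentially the test Sprenger's study performs.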
According to Sprenger's research, the sentiment ranking his system derived from tweets tracking the S&P 500 index was approximately correct, and made it possible to foresee the market's direction one day in advance. Studies show that investors who traded on the stock market using Sprenger's study had an average profit of 15%.
Last year, researchers at the University of Manchester and Indiana University published another study, which compared a prediction derived from Twitter messages with the Dow Jones Industrial Average index and reported an accuracy of 87%.
There are similar ideas and startup projects, such as StockTwits.com.
It should be noted that this approach is not without risk, because Twitter is an open system that can be gamed to promote specific stocks, similar to what happened on message boards in the early days of Web-based investing (or, for that matter, to SEO engineers gaming Google search). Anyone planning such an approach to stock trading must take this into account.
Posted by: Dejan Petrovic
Automatic analysis of public opinion (text analysis) is a new, emerging field that overlaps with many other businesses, such as business intelligence, customer service and brand management, and it appeared out of the need to measure the market. Many types of sentiment-analysis software are in use; they apply text-analytics technology to material gathered from social media, newspaper articles, internal documents and databases. The market for text analytics alone could reach a value of 980 million dollars in 2014, up from 500 million in 2011.
This technology allows almost anyone to analyze the sentiment of consumers regardless of any marketing campaign run by the company.
Classical market research can give companies an insight into how a small group of consumers sees a product at a given moment; sentiment analysis, by contrast, is more like a continuous video recording.
Such a software package takes into account all the conversations about a product, which are then searched and analyzed using statistical analysis and so-called natural language processing, a system for interpreting written communication even when slang is used. From the identified consumer sentiment about a brand or famous personality, the software predicts market behavior. Some brands can even make money when consumers have a negative disposition towards them (e.g., Lady Gaga).
Communication with customers
Researching consumer sentiment toward products and services is vital to improving quality and keeping customers satisfied (e.g., the sentiment of hotel guests). The data sources for analysis are numerous: social networks, blogs, comments, e-mail and databases, for instance a company's CRM system.
Attention should be paid to accuracy issues in sentiment analysis, in order to filter out spam and other irrelevant information. It is also necessary to calibrate the mood-measuring tools and to adjust them over time, depending on the analysis and the target groups.
Semantic search in Media
Semantic technologies, or the study of meaning, will play a big role in the further development of knowledge base management for many industries, especially for the media industry.
Semantic search engines represent a broad area and a higher realm of semantic technology, which also includes knowledge bases, finding and interpreting information, and so on. The main goal of semantic search is to obtain more accurate results, so the system tries to understand the intentions of the person searching and the contextual relationships between the terms used in searches.
All serious websites that provide news want to attract more visitors. This is especially true for news organizations and blogs, which produce a large amount of news and commentary every day; in such piles the news is not very useful, because no one will read it all. So the goal is to make the content as searchable as possible.
The main challenge in information retrieval is the structure of the Web, which is predominantly written in HTML, a language that describes how information will look rather than what it means. As a result, information within a web page, such as captions or the date of publication, is formatted in HTML but not explicitly marked up. This makes it difficult for the rest of the Web to understand the nature of the structured content: web pages are formatted so that people can easily read them, but machines cannot easily extract the meaning when there is no consistent structure.
Many communities on the web are working on this problem, especially around Linked Data, which is increasingly the center of these efforts. Linked Data is the best example of publishing, sharing and connecting pieces of data, information and knowledge on the Semantic Web.
Social media is another area where Web data looks powerful. Twitter, for example, with 110 million messages per day, and Facebook, with 250 million users active per day, look like great platforms for advertising a brand.
Users spend more and more time on social networks, and companies understand the importance of their brands' presence on social platforms. Analysts have focused on adapting to this advertising medium, while companies are trying to measure the return on investment (ROI) of marketing activities on social networks. In all these activities analytics tools are of major importance.
Following the trend of last year, many companies have added the analysis of public opinion (sentiment) in the list of must-have features in their tools for monitoring social media.
Communication between a brand and social platforms runs in two directions: first, when the brand's marketing message is sent through social channels, and second, when users discuss brands and products. As a result, there are two main ways in which brands currently use semantic technology:
- Consumer sentiment analysis (public opinion). Brands (companies) want to know what consumers are saying about them. A growing number of services use text analysis to parse the grammar and find the meaning behind the sentences someone writes about a product. In most cases this means determining whether an opinion about a product or service is positive or negative. More advanced cases go further and find the intention behind the statements (sentences). These services enable brand companies to separate the unimportant events on social networks from those with the greatest potential gain.
- Consistency of marketing messages. Monitoring user sentiment is obviously important, but there is another application of text analysis in social networks: research into the consistency of marketing messages. The information obtained from this analysis is important for determining future brand strategies and messages.
Posted by: Dejan Petrovic
The growth of RDF and microformat implementations on the Web
Based on data collected from about 12 billion web pages indexed by Yahoo, with research conducted at three different points in time, there is evidence of growth in the Semantic Web market, especially in the number of RDF and microformat implementations.
The data show an increase in the use of RDF of up to 510% between March 2009 and October 2010, from 0.6% to 3.6% of the web pages under study. It should be noted that the large majority of web pages still contain no structured data, which makes these results even more interesting and shows that the Semantic Web is moving forward in ever bigger steps.
DBPedia, Knowledge base
DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows us to ask sophisticated queries against a huge amount of information from Wikipedia and to link other datasets on the Web with Wikipedia's information. This will certainly provide new ways of using the encyclopedia, as well as new ways of navigating, linking and generally improving it.
As a knowledge base, DBpedia will play a significant role in improving the quality of information on the Web, data search and data integration. Today, most knowledge bases cover a specific domain, are created by small teams of engineers, and are costly to update and maintain. At the same time, Wikipedia has grown to the point where it has become the largest man-made knowledge base, maintained and updated by thousands of people. The DBpedia project taps this great source of knowledge by extracting structured information from Wikipedia and making it accessible on the Web.
The DBpedia knowledge base currently contains 3.5 million items (ontology things), of which 1.67 million are classified in a consistent ontology, including 364 thousand people, 462 thousand places, 99 thousand music albums, 54 thousand films, 17 thousand video games, 148 thousand organizations, 169 thousand animal and plant species and 5,200 diseases. DBpedia describes the 3.5 million items in 97 languages, with 1,850,000 links to images, 5.9 million links to other web pages and 6.5 million external links to other RDF data sets.
Finally, we can say that DBpedia has several advantages over other knowledge bases: it covers many domains, it is the result of a general (community) agreement, it automatically evolves as Wikipedia changes, and it is truly multilingual.
Applications using the DBpedia Knowledge Base can:
- improve data search
- use the Knowledge Base on our web pages
- power mobile and geographically localized applications
- automatically classify, annotate and capture web pages with Wikipedia's content
- create multi-domain ontologies
- share information and knowledge, as the beginning of the Semantic Web and Linked Data
- support content input, with suggestions derived from the Knowledge Base
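As a sketch of the first application, improving data search, here is how a query against DBpedia's public SPARQL endpoint might be built. The endpoint URL and the dbo:birthPlace property are DBpedia's documented vocabulary; the helper function, its name and the example resource are our own illustrative choices:

```python
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"

def people_born_in(city_resource, limit=10):
    """Build a SPARQL query listing people born in a given place,
    using DBpedia's ontology (dbo:birthPlace). The resource name is
    the English Wikipedia article title, e.g. 'Belgrade'."""
    query = f"""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?person WHERE {{
        ?person dbo:birthPlace dbr:{city_resource} .
    }} LIMIT {limit}
    """
    return query.strip()

# The query is sent to the endpoint as an ordinary HTTP GET request:
query = people_born_in("Belgrade")
url = DBPEDIA_ENDPOINT + "?" + urlencode({"query": query})
print(url[:60] + "...")
```

Submitting the URL (for example with urllib.request.urlopen) returns the matching resources; the same pattern works for any property in the DBpedia ontology.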
Posted by: Dejan Petrovic
Content Industry in 2011.
Our websites have long since become applications that constantly improve and evolve. After the "events of the people" (a term similar to revolution) on the Internet, via the revolution of social networks, websites and applications are beginning to evolve beyond providing basic and social content. In addition to the content and information governed by an editor or administrator of a website, visitors can affect the content by adding comments or voting for a favorite text, changing its order. Facebook has recently enabled easy integration of some social services into websites, so content becomes composite and multilayered.
Many websites are based on open-source content-management solutions. They consist of a web application for content management, a presentation part that shows the data/content to visitors, and databases with structured data. This approach often includes input and review of content before publication, logging, and management of access rights to certain functions and website content.
On the horizon a new open-source alternative is appearing: the sharing of large amounts of data and content, rather than the sharing of code, a movement known as Open Data. Opening data and content, and mixing public and private sources, will have a major impact on the website-content industry and the way we manage content. Here we meet the paradigm of Linked Data, a concept that seeks to enable open content management and to provide standardization in this area in the future.
In the time to come there will be a large growth of information, and eighty percent of it will be unstructured. Forums, news comments, blogs and social networks, for example, have a big problem presenting the large amounts of unstructured data already in their databases in a form appropriate for their visitors. The process by which such content is transformed into intelligent content is called enrichment, a term borrowed from the metallurgical industry. This transformation can be done by various semantic services and tools: text analysis, tagging and so on. A good example is the Open Calais service, which extracts from a text the entities related to facts, events, individuals, geographical terms, etc.
You can see examples of Open Calais service on our site in Our Lab.
We built solutions for the following content sources:
- semanticweb.rs, b92.net, bbc.co.uk, cnet.com
In much web content there is a need for personalization in the presentation of data, and it will certainly be one of the important trends in the future. Personalization of content will go in two directions: the first allows manual adjustment, while the second is automatic or implicit, in the form of suggestions depending on the social profile, quizzes, voting, entered comments and page-view history. The first approach is for the traditional, passive reader, the second for an active participant in the site. In general, this approach should allow easy access to heterogeneous data sources.
As information accumulates and the entropy of content increases, searching for information once again becomes a hot topic. Classical data search is integrated into almost any website through a CMS. But this search is limited to traditional keyword-based results, which is insufficient in light of the changes and the amount of data to which we are exposed. Trends point to adding semantic context to keyword search through suggestions that quickly take visitors to the desired content. We can already see much of this in the Google and Bing search engines.
Posted by: Dejan Petrovic
Top 10 Semantic Web implementations in 2010.
Each year the independent web portal ReadWriteWeb selects 10 products or projects in several categories. We carry their top five choices for implementations in the field of the Semantic Web in 2010. All the listed projects (implementations) had a major impact on the Internet in terms of innovation and growth in the number of users. These innovations are mainly grouped behind brands and organizations such as Facebook, Google, Best Buy, the BBC and Data.gov.uk.
Certainly the biggest news and event in the field of the Semantic Web happened in April, when Facebook introduced a new platform and tools for working with its Open Graph database.
Open Graph tool allows website owners to easily integrate their Web pages in a social network.
The Holy Grail of Web search technology is the ability to pose simple questions in everyday language and get simple answers. In May, Google introduced Google Squared as an addition to web search results. This functionality was added to the traditional search results in two ways.
First, answers to simple questions gained a Show Sources button, which shows us how Google arrived at the answer.
Second, alongside the results a list of possible related searches is offered, which may be in the domain of our interest, derived from our keyword search.
Best Buy, a leading U.S. online retailer, is another large corporation that followed the trends this year and introduced significant improvements using semantic technologies, in terms of the visibility of, and ease of access to, goods and services for buyers. Specifically, it marked up store data such as name, address, opening hours and geo data (location) using the semantic language RDF, so that browsers are now able to easily identify the data and place them in the proper search context.
In January, Data.gov.uk launched a service that makes public data held by the government available to developers. The service has grown rapidly, in contrast to the previously launched U.S. service, and 4,600 data sets are now available.
This is certainly a remarkable example of how public information held by government can be made useful.
BBC World Cup website
The biggest sporting event this year was the World Cup, which was widely followed in the media.
For the 2010 World Cup the BBC created a dedicated website that reported on the tournament and included semantic tools for real-time reporting of the event.
The website contained more than 700 web pages, equipped with semantic technology and enhanced with a complete ontology (concept map), from which pages were automatically created based on metadata.
It was an impressive demonstration of how a large, mainstream website can be enriched with structure, a knowledge base, interpretations and facts.
More on ReadWriteWeb »
Linked Data, in a few words
Linked Data is a concept that aims to make Web content more connected and data-oriented. The term is less rigidly defined than the Semantic Web; perhaps we can talk about it as a standard. In general, people working on Linked Data are focused on capturing the context of content, assisting classification with unique identifiers and references, and improving the experience of using Semantic Web tools.
The idea of Linked Data is the integration of many existing components at the level of data, platforms and applications. Linked Data is engaged in the following areas of application:
- Representing entities: defining who, what, when and where on the Internet. Entities carry meaning and contain context. At its simplest, an entity is a row in a list of statements organized by type, such as people, places or products, where each is unique.
- Annotating entities: finding and recording entities where they exist in unstructured content such as web pages, blogs or forum comments. Here we have several tools, such as Facebook Open Graph, HTML5 microdata, RDFa and the hCard microformat.
- Identification and traceability: entities contribute best to the semantic ecosystem when they are connected with a URI (Uniform Resource Identifier). A URI is an ideal point of connection, identification and access to an entity, because it is accessible over the Web and readable by machines. An entity's point of connection should provide traceability, properties and information about relationships with other entities.
- Finding entities: some enthusiasts are able to spend all day annotating content to a level understood by both people and machines. There is no magical tool that does this completely automatically, but new technologies and tools for searching unstructured data are improving all the time. The aim of these techniques is to identify an entity, its identifier, its context and its type. They are often combined with heuristic techniques.
(Example: Named entity recognition - NER)
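A full NER system is statistical, but its simplest form, gazetteer lookup, can be sketched in a few lines (the entity list and tag names below are invented for illustration; real systems learn and disambiguate such lists from large corpora and knowledge bases):

```python
import re

# Tiny hand-made gazetteer mapping surface names to entity types.
GAZETTEER = {
    "Belgrade": "PLACE",
    "Tim Berners-Lee": "PERSON",
    "BBC": "ORGANIZATION",
}

def annotate(text):
    """Mark known entities in free text with their type.
    Longer names are matched first, so 'Tim Berners-Lee' is not
    broken up by a shorter overlapping entry."""
    for name in sorted(GAZETTEER, key=len, reverse=True):
        text = re.sub(re.escape(name),
                      f"[{name}|{GAZETTEER[name]}]", text)
    return text

print(annotate("Tim Berners-Lee spoke at a BBC event in Belgrade."))
```

The bracketed output is exactly the kind of machine-readable annotation the list above describes: each recognized entity now carries an explicit type that downstream tools can link to a URI.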
- Reconciling entities: various ontologies and/or knowledge bases may assign terminology and properties to a single entity; for example, a local business can appear in multiple lists of entities. Since an entity is defined by a URI, which is unique, it is the tools' task to search for and find the same entity across different data sets.
- Connections: entities are part of a story. The real power of the Semantic Web lies in connecting different types of entities, such as employees with companies, politicians with donors, brands with stores and so on. Graphs, i.e. networks of entities and how they are connected, give the true meaning of the Semantic Web.
Linked data »
Posted by: Dejan Petrovic
HTML5 and semantics
These days we have seen Microsoft step away from Silverlight, a platform that was supposed to be the basis for creating rich web applications and a contender to Flash.
Microsoft chose the HTML5 standard, which is in the process of verification by the W3C consortium, as the platform for developing its tools for creating web applications.
HTML5 face to face with the Semantic Web
As far as the semantic world is concerned, HTML5 brings some semantic structure through new block elements such as header, nav, article, section, aside and footer.
Read more »
Posted by: Dejan Petrovic
Royal Society Web Science meeting
It was one of a series of seminars organized by the Royal Society as part of the celebration of the society's 350th anniversary. It was an academic gathering, with the participation of the star of the Web scientific community, Tim Berners-Lee.
Web science is a new scientific discipline, recently established and defined in 2006. The initial activities of academics concern the use of science in the broadest sense: analytical and mathematical models to understand the mechanics of the Internet, the Web as engineering design, the shape and structure of the Web, the social models that govern the Internet, and the diversity and dynamic nature of links and Web content.
An important field of action is "collective intelligence", best reflected by Wikipedia and Galaxy Zoo. Web science also encourages transparency in government and citizens' contact with legislators via the Web.
Read more »
Posted by: Dejan Petrovic
Semantic Technologies Into Your Enterprise
Over thirty companies that embody semantic technologies are routinely featured in surveys of the enterprise search landscape. But dozens more contribute semantic software solutions in the broader information marketplace, and they are largely unknown to the average knowledge worker. Because these semantic tools are not familiar to IT and business managers, they are underutilized where opportunities for major enterprise semantic search improvements could be made.
Read more »
Gilbane Group »
Source: Lynda Moulton, Outsell’s Gilbane Group
Live matrix, the guide to scheduled live online events
LIVE MATRIX is the guide to scheduled live online events, from video concerts to private auctions to gaming contests. Of course, Live Matrix isn’t just about video events – in fact, Spivack says he believes the service now provides the largest and most comprehensive schedule of private sales on the web.
Sanjay Reddy, co-founder and CEO of Live Matrix, said:
The vision begins with a question: It would be incredibly difficult to navigate 1,000 channels on digital television without an interactive program guide to tell you when programming is on, so how can you navigate a ton more "channels," across a lot more than video, on the Web without a guide?
Traditional search engines index the space component of the Web and tell you what is where, while Live Matrix is indexing the time dimension of the Web — by telling you What's When on the Web.
The need for a guide to scheduled online events is obvious and the timing is perfect given the level of broadband penetration, the declining cost of bandwidth, the growth of broadband connected devices and the millions of daily scheduled events online.
Nova Spivack, co-founder and President of Live Matrix, said:
The ‘aha moment’ came when I realized that there are huge numbers of live scheduled events happening online every week, but there is no way to find out about them efficiently.
The Live Web is quite different from TV. Many live events on the Web are interactive and participatory — for example live chats, sales and auctions, game tournaments, and events in virtual worlds. In order to participate fully, or in some cases, at all, in these events, you have to be there when they start. And this requires a way for people to find out about these events in advance – there needs to be some kind of a schedule of upcoming online events. In addition, some kinds of online events are highly perishable, their content gets less relevant over time -- for example, breaking news, sports highlights, product launches or announcements all get stale after a while — so finding out about them in a timely fashion is imperative. There are also many live online events, such as big online concerts or major sporting events, where sharing the experience with other people is a key element; watching archived recordings of such events after they are over is not the same as attending and Twittering about them live with your friends.
Posted by: Dejan Petrovic
In 2003, when the World Wide Web Consortium was working toward the ratification of the Recommendations for the Semantic Web languages RDF, RDFS, OWL 1 and OWL 2, we realized that there was a need for an industrial-level introductory view of these technologies. The standards were technically sound, but, as is typically the case with standards documents, they were written with technical completeness in mind rather than education. We realized that for this technology to take off, people other than mathematicians and logicians would have to learn the basics of semantic modeling.
Introducing semantics to the Internet will lead to a new generation of services based on content (web pages, documents, etc.).
Search engines will understand topics and concept-based searches, and queries will be expressed at the semantic level by defining topics. We can already see hints of this in Google's search engine and in Bing.
RDF » | RDF Schema (RDFS) » | N-triples » | OWL 1.1 » | OWL 2 »
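At the heart of RDF is the triple model: every statement is a (subject, predicate, object) triple, and querying means matching patterns over a set of triples. The toy store below (names with the made-up `ex:` prefix are illustrative) sketches that idea in plain Python; real applications would use an RDF library such as rdflib and the SPARQL query language instead.

```python
# Toy illustration of the RDF triple model: facts as
# (subject, predicate, object) triples, queried by pattern matching.
triples = {
    ("ex:TimBL", "ex:invented", "ex:WorldWideWeb"),
    ("ex:TimBL", "ex:worksAt", "ex:W3C"),
    ("ex:W3C", "ex:published", "ex:OWL"),
}

def query(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# "What do we know about ex:TimBL?" -- wildcard predicate and object.
print(query(s="ex:TimBL"))
```

The same pattern-matching idea, generalized with variables that can bind across several triples at once, is what SPARQL queries express over real RDF graphs.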
Posted by: Dejan Petrovic
The Semantic Web is an evolving development of the World Wide Web in which the meaning (semantics) of information and services on the web is defined, making it possible for the web to "understand" and satisfy the requests of people and machines to use the web content.
Read more »
History and Context
As early as the 1980s significant research appeared in information science literature about the development of expert systems for improving search results. Hundreds of universities, start-up companies, and major corporations have published research and filed patents on various algorithmic techniques for machine-aided searching over three decades (and earlier, when much of this work was classified as artificial intelligence). By the late 1990s and early 2000s, these technologies began to be described as semantic search components. In 2001 Tim Berners-Lee published an article in Scientific American proposing a semantic web evolving out of the expanding worldwide web.
Read more »
Categories of Semantic Technology
Text mining and text analytics – Analysis of text aimed at deriving high-quality, structured information and creating new meaning. Sources can include everyday speech and application data (email, RSS, CAD, databases, etc.).
Concept and entity extraction – Extracting information that meets predefined criteria from structured or unstructured text. The extracted data are topics or specific entities such as names, places, companies, and organizations, which are classified for further use.
Concept analysis – Processing extracted concepts in order to find relationships with other concepts. Links between concepts are defined by terms, context, and content.
Natural language processing (NLP) – Automated application of the results of concept analysis to determine the meaning of human assertions or queries, using computational linguistics.
Content data normalizing – Converting loosely structured content into a structured, formalized, and standardized format.
Federating and de-duplicating – Indexing all content of interest so that every item appears in a common format and duplicate results are eliminated. This allows an organization to present similar results in an easily understandable form for analysis and assessment.
Opinion mining (sentiment analysis) – Determining the attitudes and opinions of a speaker or writer by analyzing text with natural language processing and computational linguistics, in relation to a topic and/or concept.
Auto-categorization – Applying concept analysis, ontologies, and vocabularies to organize content by category, topic, and entity.
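To make the auto-categorization idea concrete, here is a deliberately naive keyword-based sketch. The category names and vocabularies are made up for illustration; production systems use the NLP, concept analysis, and ontology techniques listed above rather than raw word overlap.

```python
# Naive keyword-overlap auto-categorization (illustrative only).
CATEGORY_VOCAB = {
    "finance": {"bank", "transaction", "stock", "trading"},
    "medicine": {"gene", "patient", "drug", "disease"},
}

def categorize(text):
    """Assign the category whose vocabulary overlaps the text most."""
    words = set(text.lower().split())
    scores = {cat: len(words & vocab) for cat, vocab in CATEGORY_VOCAB.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"

print(categorize("A new drug slows the disease in patients"))
```

Even this crude version shows why normalization matters: "patients" does not match the vocabulary entry "patient", which is exactly the kind of gap that stemming, entity extraction, and concept analysis close in real systems.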
Posted by: Dejan Petrovic
- Search and creation of standardized information about genes, such as the Gene Ontology (http://www.geneontology.org/). This information is used in the study of the human genome, disease, and genetics in general.
- Searching and obtaining information from millions of documents used in the research and discovery of new drugs
- Normalization, integration, and analysis of patient data with the aim of early detection, prediction, and prevention of disease
- Querying in ordinary, everyday language to search millions of documents related to medical and bio research, and to find information on drugs, procedures, and indications
- Intelligence, text analysis, and early warning from social networks, forums, and comments; normalization of entities across multilingual websites
- Integration of public data through semantic grouping of government agency websites
- Integration of data related to criminal justice; normalization and merging of search result data across different court jurisdictions
- Merging search result data across local governments and auto-categorizing it by topic
- Automatic notification of users on various topics and conditions, for example tenders and public procurement.
- Isolating and analyzing concepts to improve indexing and the relevance of search results for users
- Auto-categorization and classification across various topics and areas, making it easier for readers to find the desired publication and helping publishers achieve better sales and reach new markets
- Auto-suggesting book titles that may interest the customer, based on concepts generated in real time
- Querying in ordinary, everyday language by the user
- Automatic checks of conformance to design standards and quality requirements
- Semantic data search and auto-categorization are embedded in many of today's software industry tools
- Integration of code from various sources
- Large software firms that run multiple projects simultaneously have built unified search solutions across different code repositories
- Ontology-driven project management for large bodies of development code.
- Real-time public opinion polling on news that can affect trading and stock performance
- Finding and normalizing data on transactions carried out in the varied systems belonging to banks in merged operations
- Semantic processing of data with the aim of integrating many different systems for complex banking events (e.g. wire transfers)
- Language analysis, eliminating irrelevant content and categorizing relevant content and materials for case preparation
- Entity extraction and concept analysis aimed at normalizing the subject indexing of documents, memos, and e-mail
- Merging and normalizing legal and other documents in order to prepare a case
- Data mining and auto-categorizing catalogs of multiple parts, equipment, and materials manufacturers
- Automatic topical tagging and annotation of documentation for complex, large-scale manufacturing
- Ontology of aircraft geometric and structural components to maintain impact awareness of changes
- Entity extraction and normalization to integrate geographical physical information with mobile search
- Natural language processing support for customer self-service on mobile devices
- Mining and normalizing basin data results from disparate sources to federate results for analysis
- News feed monitoring to discover trends and new entities for competitive intelligence
- Federating content across global operations for strategic business analysis and improvement
- Concept discovery and topical categorization of large domains for scholarly research support
- Federation of public content search results with academic resources
- Ontology use for unifying educational resources across collaborating campuses or institutions
- Improving search, navigation, and knowledge bases for various materials, as well as increasing sales of related offerings.