What we currently know also comes from the article by the heise editorial team. You can read about what happened there; please follow the linked article above.
An official statement from Microsoft/XANDR is still pending; we have not received one at this point.
The publicly available Excel file mentioned in the article (now removed) also lists 46 (forty-six) hyScore.io segments by name, consisting of 11 test segments and a number of individually created segments for online advertising agencies and clients.
The file itself lists a total of 651,465 different segments from 94 vendors.
Our current statement on the matter (9 June 2023):
No personal, device or cookie information is recorded, processed, used in segments or required by hyScore.io.
The hyScore segments do not contain any user- or device-related data. They consist exclusively of individual article URLs that are freely accessible to anyone on the Internet. The content of each URL is analyzed separately using a combination of state-of-the-art methods from natural language processing, image and video analysis, and artificial intelligence (contextual analysis).
Depending on the analysis result, the individual article URL qualifies for a “topic field”, a target group description, a persona or a brand, and is then combined into suitable segments consisting of placements (URLs). These generic or individual segments are created dynamically from various data points (such as weighted keywords, phrases, categories, sentiment, security, etc.) and are made available, for a fee, to advertisers and their service providers on various platforms for the delivery of advertising campaigns.
Example: Articles and guides published on the Internet on the topic of “Pregnancy” can be assigned to a segment with the topic area “Health”, “Health > Pregnancy” or “Family > Parents”, “Family > Pregnancy” based on the text analysis, depending on the focus of the respective article. The bottom line is that a targeting segment in hyScore corresponds to a URL list of articles on one or more specified topic areas.
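The idea that a segment is nothing more than a URL list per topic area can be sketched in a few lines of Python. This is an illustration only: the URLs, the topic assignments and the `build_segments` helper are hypothetical and do not reflect the actual hyScore implementation.

```python
from collections import defaultdict

# Hypothetical analysis results: article URL -> topic areas assigned by text analysis.
analyzed_articles = {
    "https://example.com/guides/pregnancy-nutrition": ["Health > Pregnancy"],
    "https://example.com/family/preparing-for-a-baby": ["Family > Pregnancy", "Family > Parents"],
    "https://example.com/health/sleep-tips": ["Health"],
}


def build_segments(articles):
    """Group article URLs by topic area; each segment is just a list of URLs."""
    segments = defaultdict(list)
    for url, topics in articles.items():
        for topic in topics:
            segments[topic].append(url)
    return dict(segments)


segments = build_segments(analyzed_articles)
for topic, urls in segments.items():
    print(topic, urls)
```

Note that the only inputs are article URLs and topic labels; no user, device or cookie identifier appears anywhere in the structure.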
The segments mentioned in the “XANDR List” are “Custom Segments” that were provided to advertisers as curated reach (URL lists). These segments date from 2021 and, with some exceptions, are no longer active.
hyScore “Custom Segments” are clearly named on connected platforms such as XANDR Curate so that they can be associated with a campaign or client. These segment names are shown in plain text for organizational reasons.
We see our task as placing online advertising, WITHOUT processing device or user data, as minimally invasively and as efficiently as possible in quality environments that are contextually suitable or desired. In doing so, we support the companies and platforms of the online marketing industry.
At hyScore, we act, to the best of our knowledge, within the framework of current national and international data protection rules and laws at all times. We are happy to provide you with information about how we work and will publish further insights into our work in due course. Transparency is important to us!
If you have any questions, please feel free to contact us via the contact forms or the email address contact@hyscore.io.
As digitalization progresses, the mountains of data in companies are growing rapidly. However, much of the valuable information still lies unused in the form of texts, websites, media files, documents, and e-mails. Numerous innovations in the field of Natural Language Processing (NLP) now make it possible to evaluate this information on a new scale, which yields immediate informational and competitive advantages in many industries. A central building block in NLP is the recognition of semantic concepts in texts, known as Named Entity Recognition (NER).
Companies continuously produce text data such as e-mails, work protocols, manuals, patents and much more. Clients produce text data via e-mail, social media channels, questionnaires, reviews, comments, and other sources. Text data comes from different sources, is written by different authors in different languages and often contains spelling mistakes. Companies are making significant efforts to store this data in so-called “data lakes”. Organizing this data is often difficult and time-consuming, but automatic text analysis makes it feasible.
More efficient than a manual analysis done by humans
Finding relevant content in complex text collections requires new document analysis and search concepts. Common methods, such as searching for certain terms, i.e. the exact matching of letter sequences, prove to be inefficient in times of big data. The manual checking and classification of millions of texts by humans is, in turn, hardly economical and very time-consuming. This means that far too much time is wasted on processes that a machine can do faster and more precisely.
Get valuable insights out of all your data
Nevertheless, it is extremely important for companies to be able to include all available data in their decisions. In the course of due diligence, for example, a data room comprising several gigabytes would ideally be checked in its entirety instead of only a sample of documents. The same applies to a very large archive of digitized texts and documents, or to research across an entire online content network in which all articles and URIs (even with millions of entries) can be analyzed in a database.
Thanks to modern techniques such as Named Entity Recognition, large amounts of data can be analyzed very quickly, either in real time or in batches during defined time slots. With NLP solutions like hyScore|analyze, these processes run automatically 24 hours a day, 365 days a year.
In research, the automatic recognition of mentions of real-world objects in text is known as Named Entity Recognition (NER). It can recognize general object types such as persons, places, and organizations, as well as more specific ones such as aircraft, companies, phones, or cryptocurrencies.
Image: Difference between rule-based character search (left) and intelligent detection of entities (right). In the example on the left, the system does not find the character string “UC Berkeley” because it does not occur in the text. In the example on the right, the system recognizes the text section “University of California, Berkeley” as an organization. Similarity measures can then be used to link this organization to the entity “UC Berkeley”. Furthermore, a rule-based system cannot distinguish between the company “Apple” and the fruit. An intelligent system like hyScore|analyze can!
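The contrast in the image can be reproduced with a small Python sketch. It is a toy under stated assumptions, not how hyScore|analyze works: the entity mention is assumed to come from an upstream NER step, and `alias_similarity` is a hypothetical measure that counts an alias token as matched if it appears in the mention literally or as the initials of consecutive words.

```python
import re

STOPWORDS = {"of", "the", "at"}


def content_tokens(text):
    """Lowercase word tokens with a few stopwords removed."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def initials_of_runs(tokens):
    """Initials of every contiguous run of two or more tokens, e.g. 'uc'."""
    runs = set()
    for start in range(len(tokens)):
        for end in range(start + 2, len(tokens) + 1):
            runs.add("".join(t[0] for t in tokens[start:end]))
    return runs


def alias_similarity(alias, mention):
    """Fraction of alias tokens found in the mention, literally or as initials."""
    alias_tokens = content_tokens(alias)
    mention_tokens = content_tokens(mention)
    if not alias_tokens:
        return 0.0
    candidates = set(mention_tokens) | initials_of_runs(mention_tokens)
    return sum(t in candidates for t in alias_tokens) / len(alias_tokens)


text = "She studied at the University of California, Berkeley."
mention = "University of California, Berkeley"  # assumed output of an NER step

exact_hit = "UC Berkeley" in text                   # rule-based character search
linked = alias_similarity("UC Berkeley", mention)   # similarity-based linking
print(exact_hit, linked)
```

The exact search fails because the letter sequence “UC Berkeley” never occurs, while the similarity measure matches both alias tokens. Production systems typically use learned representations rather than a hand-written rule like this one.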
The development of NER systems goes back to the early 1990s, but has recently been boosted by the application of deep neural networks. The improved accuracy of these systems rests on two fundamental advances. First, neural networks can take entire sentences or even entire documents into account, whereas older systems were limited to a few words of context. Second, the mathematical representation of individual words is far more advanced than before.
Contextual data is data that gives context to a person, entity or event. It is commonly used by business organizations for market research and prediction. Contextual data is taken from various sources and may include business information, family and socioeconomic background, educational history, health background, general environment and many other factors. You’ll find more definitions of contextual data at the end of this article [1].
At hyScore.io we define contextual data as follows: contextual data in our “context” simply tells you more about the meaning of a website or any provided (plain) text. We structure unstructured data and express the meaning and most important content of the website or text as scored and weighted keywords with their entity types, plus a sentiment score that is derived by evaluating the opinions expressed in the text based on specific words. The sentiment is then judged to be positive, negative or neutral.
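To make the word-based sentiment step concrete, here is a minimal lexicon-based sketch in Python. The word lists and the scoring rule are assumptions chosen for illustration; the text above does not specify how hyScore actually computes its sentiment score.

```python
import re

# Tiny hypothetical lexicons; a real system would use far larger, weighted ones.
POSITIVE = {"great", "beautiful", "excellent", "recommend"}
NEGATIVE = {"horrible", "scam", "trap", "bad"}


def sentiment(text):
    """Return (score, label): positive word count minus negative word count."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return score, label


print(sentiment("A great finca with a beautiful view"))
print(sentiment("A horrible scam"))
print(sentiment("The finca has three rooms"))
```

A plain word count like this ignores negation and irony, which is one reason real systems combine it with context-aware models.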
Furthermore, we classify and weight the website in our own category taxonomy and map these categories directly to the IAB standard taxonomy (Tier 1 / Tier 2). This kind of contextual data is useful for several use cases in many industries.
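A mapping from an in-house taxonomy to Tier 1 / Tier 2 labels can be sketched as a simple lookup that carries the weights along. Everything in this Python example is hypothetical: the in-house category names, the weights, and the Tier 1 / Tier 2 labels are placeholders and are not copied from the IAB taxonomy or from hyScore.

```python
# Hypothetical in-house category -> (Tier 1, Tier 2) labels; placeholders only.
TAXONOMY_MAP = {
    "Travel > Spain > Mallorca": ("Travel", "Europe"),
    "Family > Pregnancy": ("Family and Relationships", "Pregnancy"),
}


def map_categories(weighted_categories):
    """Sum in-house category weights per mapped (tier1, tier2) pair, highest first."""
    mapped = {}
    for category, weight in weighted_categories.items():
        tiers = TAXONOMY_MAP.get(category)
        if tiers is None:
            continue  # categories without a mapping are skipped in this sketch
        mapped[tiers] = mapped.get(tiers, 0.0) + weight
    return sorted(mapped.items(), key=lambda item: item[1], reverse=True)


page_categories = {
    "Travel > Spain > Mallorca": 0.75,
    "Family > Pregnancy": 0.25,
    "Unmapped category": 0.5,
}
print(map_categories(page_categories))
```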
Contextual Data is about the content and environment of a website/text
Contextual data is most valuable when it is delivered to the right person, at the right time, within an actionable context. For example, a user reads an article about renting a finca in Cala Millor on the island of Mallorca, Spain. Wouldn’t it be great to show them a contextually matching video about Mallorca, the Cala Millor region, or a best-practice video on “how to rent a finca”? Wouldn’t it make sense to show them a contextual ad for a “finca rental service”, or links to previous articles and user reviews on the topic? If the sentiment of the article is negative, you might instead show a video about “hidden traps when renting a finca in Spain”.
Conversely, context can also tell you what not to show, e.g. for brand safety. As an airline, you might not want to advertise your great deals on trips to New York right next to news about a horrible plane crash.
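The brand-safety decision can be expressed as a small rule on top of the contextual data. This Python sketch assumes a page is described by a list of topics and a numeric sentiment score; the field names, the topic labels and the `should_place_ad` helper are hypothetical.

```python
def should_place_ad(page, blocked_topics, min_sentiment=0):
    """Return True only if the page avoids blocked topics and negative sentiment."""
    if set(page["topics"]) & set(blocked_topics):
        return False  # the advertiser excluded this environment outright
    return page["sentiment_score"] >= min_sentiment


# Hypothetical contextual data for three pages.
crash_news = {"topics": ["News", "Aviation accident"], "sentiment_score": -3}
travel_guide = {"topics": ["Travel", "New York"], "sentiment_score": 2}
angry_review = {"topics": ["Travel", "New York"], "sentiment_score": -1}

airline_blocklist = {"Aviation accident"}

for page in (crash_news, travel_guide, angry_review):
    print(page["topics"], should_place_ad(page, airline_blocklist))
```

Keeping the blocklist per advertiser lets the same page be acceptable for one brand and excluded for another.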
With hyScore’s contextual data API, you know at that very moment what a user is reading and in which environment, and you can use this information directly to deliver additional content based on this actionable context, or to withhold it. If you just want to enrich a user’s profile (interests, favorite topics, etc.), you can do this by simply sending the user identifier with the initial request to our API. We simply pass it through and return it together with the information about what that user has read.
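The pass-through of a user identifier might look roughly like the following Python sketch. It deliberately makes no network call, and every name in it is an assumption: the field names `url`, `user_id` and `keywords`, the response shape, and both helper functions are invented for illustration and are not the documented hyScore API.

```python
import json


def build_request(url, user_id=None):
    """Build a JSON request body; 'url' and 'user_id' are assumed field names."""
    body = {"url": url}
    if user_id is not None:
        body["user_id"] = user_id  # opaque identifier, passed through unchanged
    return json.dumps(body)


def enrich_profile(profiles, response):
    """Append the page's keywords to the profile of the echoed user identifier."""
    user_id = response.get("user_id")
    if user_id is not None:
        profiles.setdefault(user_id, []).extend(response["keywords"])
    return profiles


request_body = build_request("https://example.com/finca-cala-millor", user_id="u-123")

# Hypothetical response that echoes the identifier next to the analysis result.
response = {"user_id": "u-123", "keywords": ["finca", "mallorca"]}
profiles = enrich_profile({}, response)
print(request_body)
print(profiles)
```

Because the identifier is opaque to the analysis, a request without `user_id` yields the same contextual data and simply leaves all profiles untouched.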
hyScore’s definition of contextual data is simple and valuable for many use cases. We don’t build products based on our data ourselves. We leave it up to you how you use this kind of data in your business context. You can refine the data we provide by using it in your own product, application, or any intended use case. We don’t mind whether you use it for content recommendation, site search improvements, tagging, a contextual video player, contextual advertising, environmental analysis for brand safety, fraud detection, website classification, user profile enrichment, audience and user segmentation, digitalization, research, or anything else.
Our mission is to remove the major pain point of getting access to this kind of contextual data. You need no additional infrastructure, and you don’t need computational linguists or natural language processing experts. All you need is an API key. Sign up for a free account.