December 4, 2016

RESTful-API documentation

Documentation of the hyScore.io RESTful-API

If you have a question, please check our Frequently Asked Questions (FAQ) section before contacting our support. Thanks.

Endpoint

Endpoint v1 : api.hyscore.io

Endpoint v2 : api.hyscore.io/v2/

Method

POST

Authorization

Header field | Description
x-api-key | API key (mandatory)
Authorization | Basic Og==
Content-Type | application/x-www-form-urlencoded OR raw
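
The header table above can be sketched in Python as follows. The value Og== is the Base64 encoding of a single colon, that is, HTTP Basic credentials with an empty user name and an empty password. The helper name build_headers and the use of application/json for the "raw" case are assumptions made for illustration; they are not part of the hyScore documentation.

```python
import base64


def build_headers(api_key, as_json=False):
    """Return the three request headers from the table above.

    build_headers is a hypothetical helper. Sending "raw" JSON with the
    content type application/json is an assumption.
    """
    if not api_key:
        raise ValueError("the x-api-key header is mandatory")
    # Base64 of ":" (empty user name, empty password) is "Og==".
    basic = base64.b64encode(b":").decode("ascii")
    if as_json:
        content_type = "application/json"
    else:
        content_type = "application/x-www-form-urlencoded"
    return {
        "x-api-key": api_key,
        "Authorization": "Basic " + basic,
        "Content-Type": content_type,
    }
```

Calling build_headers with a placeholder key yields a dictionary that can be passed to any HTTP client.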

Usage

The POST body must be either application/x-www-form-urlencoded or raw JSON. Available fields in the POST body:

Parameter | Description
url | Full URL of the article to be analyzed (e.g. http://domain.com/folder/article001.html).
text | Raw text to be analyzed.

Note: You can use the parameter "url" OR "text". A combination of both leads to an error.

numberOfKeywords | Number of keywords returned in the JSON response (default = 5). We recommend 2 to 3 keywords for use cases like "contextual video" or "contextual advertising".
uuid | Type = String. Can be used for tracking purposes. If you want to personalize your offer for a specific user or enrich a user's data set, you can inject an identifier here.
customData | Type = String (array). Can contain any custom information, identifier, etc.
getText | Type = Boolean (Default=False). Set getText=True if the system should return the extracted text.
getMeta | Type = Boolean (Default=False). Set getMeta=True if the system should return the extracted meta keywords.
imagedivid | API v2 only. Type = String. Instead of letting the system automatically choose the appropriate image, you can enter the id of the image div that should be used. With the parameter "imagedivid" you can force the API to extract the image URL of a specific image container.

Note: Takes either url or text as input, not both at the same time.
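
As a minimal sketch of the parameter rules above, the following hypothetical helper builds a form-encoded POST body and rejects a combination of url and text. The function name build_form_body is an assumption, and so is rejecting a body that contains neither field; the documentation only states that combining both leads to an error.

```python
from urllib.parse import urlencode


def build_form_body(url=None, text=None, number_of_keywords=5,
                    uuid=None, get_text=False, get_meta=False):
    """Encode the documented fields as application/x-www-form-urlencoded."""
    # The docs state that "url" and "text" must not be combined.
    # Requiring at least one of them is an assumption of this sketch.
    if (url is None) == (text is None):
        raise ValueError('pass exactly one of "url" or "text"')
    fields = {"url": url} if url is not None else {"text": text}
    fields["numberOfKeywords"] = number_of_keywords
    if uuid is not None:
        fields["uuid"] = uuid
    # The switches default to False, so they are only sent when enabled.
    if get_text:
        fields["getText"] = "True"
    if get_meta:
        fields["getMeta"] = "True"
    return urlencode(fields)


body = build_form_body(url="http://domain.com/folder/article001.html",
                       uuid="feedback@hyscore.io")
print(body)
```

The printed body is percent-encoded in the same way as the POST example further down in this document.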

Response

Response is in JSON format and contains the following fields:

Key | Description
category | API v1 only. The category of the URL/website.

Response:
"gx_retry" until the category is determined.

Note: You can find the full hyScore category list, containing the IAB category mapping and the response status list, as a *.csv file below. See "IAB category mapping".

Deprecated:
In some cases we may respond with "no channels returned" OR "null". These are "corpses" which will be removed from our index over time (mainly old URLs first requested in the early phase of hyScore). It is a self-cleaning mechanism.
categories | API v2 only. The weighted categories of the URL/website (max. weight: 100). The category with the highest weight is the most likely one.

Response:
"gx_retry" until the category is determined.

Note: You can find the full hyScore category list, containing the IAB category mapping and the response status list, as a *.csv file below. See "IAB category mapping".
customData | The value of the customData input field, if used. Any format allowed (numeric, strings), e.g. "23D-XZ-2300" or "user@domain.com".
iab | The category of the URL/website as an IAB category.

In API v2 you get a list of matching IAB categories with a weighted score. The category with the highest weight is the most likely one.

Official IAB content taxonomy: category (Tier 1) and IAB code (Tier 1 & 2). Example:

API v1:
"iab": {
"category": "Automotive",
"code": "IAB2-4"
}

API v2:
"iab": {
"category": "Automotive",
"code": "IAB2-4",
"weight": 28.053
}

Response:
- "no channels returned" until the category is determined
- "category not supported by IAB": we were not able to determine an IAB category for the website and its content.
image | API v2 only. URL of the article image, if available. Either chosen automatically or via the imagedivid parameter.
language | The language of the given content/text, determined automatically by the system:

  • “de” : German (DE)

  • “en” : English (EN)

  • “fr” : French (FR)

  • “tr” : Turkish (TR)

  • “ru” : Russian (RU)

  • “es” : Spanish (ES)

  • “pt” : Portuguese (PT)

  • “it” : Italian (IT)

  • “nl” : Dutch (NL)

  • “se” : Swedish (SE)

  • “dk” : Danish (DK)

  • “hu” : Hungarian (HU)

  • ... more to follow soon.

metaKeywords | The meta keywords of the URL as set by its publisher, if any (not weighted). Has to be activated with getMeta=True.
text | The article text that was extracted and analyzed (if "getText=True" is set). Matches the text input field if it was used.

If the default "getText=False" is in effect, you'll get the response "text": "Deactivated. See docs."
text length | The length of the analyzed text. Example: an image/picture gallery often has less text and lower keyword scores, while an article with more text and many more characters yields higher keyword scores. You can use the text length as one factor or indicator for, e.g., your decision engine (use cases: recommendation, brand safety, etc.).
tld | The top-level domain (TLD) of the analyzed URL.
url | Origin / full URL of the site analyzed (e.g. http://domain.com/folder/article001.html).
uuid | The value of the uuid input field, if used. Any format allowed (numeric, strings), e.g. "23D-XZ-2300" or "user@domain.com".

Note: This information is only passed through the system. We don't store or cache it.
weightedKeywords | Contains the keywords/entities extracted from the input. Keywords are displayed in their normalized form. Each entry consists of:

  • type: The type of the keyword if possible, e.g. “Keyword, Country, VideoGame, Person, …”

  • name: The actual keyword (normalized).

  • weight: The weight of the keyword. The higher this value is, the more relevant the entry is to the given content (max. weight: 10).

  • frequency: The number of times the entity appears in the text.

The list is sorted by weight.
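
The weightedKeywords entries described above can be filtered by weight on the client side. The following is a minimal sketch; the helper name top_keywords and the threshold of 2.0 are assumptions chosen for illustration, and the sample entries are taken from the full response example at the end of this document.

```python
def top_keywords(response, min_weight=2.0):
    """Return keyword names with weight >= min_weight, highest weight first."""
    entries = response.get("weightedKeywords", [])
    # The docs say the list is sorted by weight; sorting again keeps the
    # helper correct even if a caller passes reordered data.
    ranked = sorted(entries, key=lambda e: e["weight"], reverse=True)
    return [e["name"] for e in ranked if e["weight"] >= min_weight]


sample = {
    "weightedKeywords": [
        {"frequency": 9, "type": "Keyword", "name": "Nungwi", "weight": 5.8244444444444445},
        {"frequency": 3, "type": "Keyword", "name": "Snorkeling", "weight": 3.006666666666667},
        {"frequency": 1, "type": "Country", "name": "Zanzibar", "weight": 2.008888888888889},
        {"frequency": 2, "type": "Keyword", "name": "Booze cruise", "weight": 1.7044444444444442},
    ]
}
print(top_keywords(sample))
```

With the sample data, "Booze cruise" falls below the threshold and is dropped.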

API and Categorization status response(s)

Status | Description | Last modification

2xx

No data yet | If you get this response on the first request, we have seen this URL for the first time ever. hyScore will analyze the URL/text as fast as it can and provide a proper result within a few milliseconds to seconds. | never

Note: Whenever we have no result yet, we respond with "No data yet" within milliseconds (average = ~50 to 65 ms). This is the system default. If you need a faster response, get in contact.
"category": - Response
gx_adserver | The page is for an adserver iframe or similar; categorisation would be of no value. | 28.06.17
gx_contentaggregator | URL is for a site with no content of its own, typically just a page of links. | 28.06.17
gx_blocked | hyScore does not crawl this site, no editorial classification has been assigned (rare). May also mean the specific URL contains components that hyScore has blacklisted from crawling. | 28.06.17
gx_tagged | hyScore does not crawl this site, but an editorial classification has been assigned (review). | 28.06.17
gx_uncrawlable | Cannot be analysed, probably cannot even be accessed; not possible to determine what the context of the site is at all (unusual). | 28.06.17
gx_badsite_norobots | Cannot be analysed due to robots restrictions (rare). | 28.06.17
gx_baddata | (bad data) hyScore was unable to analyse this page. The URL may be invalid or blacklisted by hyScore, or the site could be unresponsive. | 28.06.17
gx_invalid | (invalid) hyScore was unable to analyse this page. The URL may be invalid or blacklisted by hyScore, or the site could be unresponsive. | 28.06.17
gx_retry | (retry) hyScore has queued this page for processing as it has not yet been analysed by our categorization. You may be able to retry shortly. Any results returned for this page will be from domain level only. | 28.06.17
gx_redirected | (redirected) hyScore was unable to analyse this page because the site redirected our crawler elsewhere when we tried to visit it. Possibly this site requires a login or otherwise has restricted access. | 28.06.17
gx_noactions | (noactions) hyScore was unable to analyse this page because the site has returned no usable content at all. Possibly the site is having issues or is refusing to serve real requests to our crawler. | 28.06.17
gx_norobots | (norobots) hyScore was unable to analyse the page directly because the site does not allow the crawler access via a robots.txt directive. | 28.06.17
gx_offline | (offline) hyScore was unable to analyse this page as the crawler is currently unable to contact the site. The site may be down, blocking the crawler, or have generated too many errors and been temporarily disabled from further crawling. You may be able to retry later. | 28.06.17
gx_badlanguage | (badlanguage) hyScore has analysed the page; however, the language is not, or only partly, supported by your platform. | 28.06.17
gx_notwhitelisted | (notwhitelisted) Function/status deprecated, not in use. | 28.06.17
gx_partial | (partial) hyScore was unable to analyse this page because the site has returned little or no usable content at all. Possibly the site is having issues or is refusing to fully serve requests from our crawler. | 28.06.17
gx_gaveup | (gaveup) hyScore was unable to analyse this page because the site is not responding to requests. Possibly the site is having issues or is refusing to serve requests from our crawler. | 28.06.17
gx_notauthorised | (notauthorised) hyScore was unable to analyse this page because the site required a login. | 28.06.17
gx_notfound | (notfound) hyScore was unable to analyse this page because the site returned a Not Found (404) error when our crawler tried to visit it. | 28.06.17
gx_nohost | (nohost) hyScore is unable to analyse this page as the site does not seem to currently exist in the global DNS records. Possibly this site is private to your network, or there has been a temporary issue with DNS lookups. | 28.06.17
gx_nourl | Either no URL was specified at all, or it could not be interpreted as a URL. | 28.06.17
gx_nomatches | hyScore has analysed this page, but it does not match any channels at all. Possibly the page contains very little content and it is not possible to determine a valid channel. | 28.06.17
gx_unmapped | hyScore has analysed this page and determined information for it; however, the mapping does not include any category for this response (unknown category result). | 28.06.17
gx_notdownloaded | hyScore was unable to download and process this page for analysis. This may be an intermittent issue or may indicate there is something in this page that Grapeshot currently cannot handle. | 28.06.17
"iab": - Response
category not supported by IAB | hyScore was not able to determine an IAB category for the website and its content. | 28.06.17
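
A client may want to separate these status values from real categories. The sketch below treats gx_retry and gx_offline as retryable, because their descriptions above say a retry may succeed. Treating every other gx_-prefixed value as a final status, and assuming no real category name starts with gx_, are assumptions of this example; the helper name classify_category is hypothetical.

```python
# Statuses whose descriptions above mention that a later retry may help.
RETRY_LATER = {"gx_retry", "gx_offline"}


def classify_category(value):
    """Map a "category" string to "retry", "status", or "category"."""
    if value in RETRY_LATER:
        return "retry"
    # Assumption: every remaining gx_ value is a non-retryable status.
    if value.startswith("gx_"):
        return "status"
    return "category"


for value in ("gx_retry", "gx_notfound", "travel"):
    print(value, "->", classify_category(value))
```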

4xx

"message": "Forbidden" | No OR wrong API key | never

IAB category mapping

The latest IAB hyScore category mapping file (*.zip file) contains the latest CSV file with hyScore's category mapping (IAB category mapping).
Last Update : 10th July 2017


Other JSON response example

Example(s):
  • Origin URL: http://www.tripadvisor.com/Attraction_Review-g616016-d8038472-Reviews-Nungwi_Beach-Nungwi_Zanzibar_Island_Zanzibar_Archipelago.html

  • POST: url=http%3A%2F%2Fwww.tripadvisor.com%2FAttraction_Review-g616016-d8038472-Reviews-Nungwi_Beach-Nungwi_Zanzibar_Island_Zanzibar_Archipelago.html&uuid=feedback%40hyscore.io&numberOfKeywords=5


Response example [1]: No data yet (usually the first request for an unknown URL). If we are not able to deliver an answer within 100 ms, we always send this response. Usually the second request for the same URL returns a proper result.

Response Header:
HTTP/1.1 200 OK

Response Body:
No Data yet
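
Because this reply is not a JSON object, a client has to check for it before decoding. The following sketch shows one way to do that and to repeat the request a few times. The helper names parse_body and fetch_until_ready, the case-insensitive comparison, the number of attempts, and the delay are all assumptions; fetch stands for any caller-supplied function that performs the POST and returns the response body as a string.

```python
import json
import time


def parse_body(body):
    """Return the decoded JSON, or None for the "No data yet" reply."""
    # Compare case-insensitively and tolerate surrounding quotes, since
    # the exact spelling of the reply varies within the documentation.
    if body.strip().strip('"').lower() == "no data yet":
        return None
    return json.loads(body)


def fetch_until_ready(fetch, attempts=3, delay=1.0):
    """Call fetch() up to `attempts` times until a JSON result arrives."""
    for attempt in range(attempts):
        result = parse_body(fetch())
        if result is not None:
            return result
        if attempt < attempts - 1:
            time.sleep(delay)  # wait before asking again
    return None
```

Passing the request function in as an argument keeps the retry logic independent of the HTTP client in use.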




Response example [2]: Website category, no channels returned (if it takes a bit longer to determine the website category/channel). The response is updated as soon as we have determined the category/channel of the website. Depending on the load, the number of websites, and the amount of text being analyzed in parallel, this can take a while. In these cases we always answer a request with "no channels returned".

Response Header:
HTTP/1.1 200 OK

Response Body:
{
"category": "gx_retry",
"iab": "no channels returned",
"customData": "",
"metaKeywords": "Nungwi Beach, Nungwi, Zanzibar Island, Zanzibar Archipelago, attractions, things to do, attraction, opinions, fun, opening, map, reviews, information, guidebooks, advice, popular",
"uuid": "feedback@hyscore.io",
"language": "en",
"url": "https://www.tripadvisor.com/Attraction_Review-g616016-d8038472-Reviews-Nungwi_Beach-Nungwi_Zanzibar_Island_Zanzibar_Archipelago.html",
"text": "If you are looking for a beach with white sand, nice restaurants, chillout sunset bars, not too crowdy, not too small, with great night dive site, than you’re just about right at the Nungwi Beach. The life here is veeeeeery slow (pole pole) and will chill you down!",
"image": "https://media-cdn.tripadvisor.com/media/photo-s/07/dd/91/3e/nungwi-beach.jpg",
"text length": 2128,
"weightedKeywords": [
{
"frequency": 8,
"type": "Keyword",
"name": "Nungwi",
"weight": 5.593457943925234
}, …




Response example [3]: Full response – A full response looks like this:

Response Header:
HTTP/1.1 200 OK

Response Body:
{
"weightedKeywords": [
{
"frequency": 9,
"type": "Keyword",
"name": "Nungwi",
"weight": 5.8244444444444445
},
{
"frequency": 3,
"type": "Keyword",
"name": "Snorkeling",
"weight": 3.006666666666667
},
{
"frequency": 1,
"type": "Country",
"name": "Zanzibar",
"weight": 2.008888888888889
},
{
"frequency": 2,
"type": "Keyword",
"name": "Scuba set",
"weight": 2.0044444444444443
},
{
"frequency": 2,
"type": "Keyword",
"name": "Booze cruise",
"weight": 1.7044444444444442
}
],
"text": "Deactivated. See docs.",
"image": "https://media-cdn.tripadvisor.com/media/photo-s/07/dd/91/3e/nungwi-beach.jpg",
"customData": "Add customData here.",
"categories": [
{
"name": "interest_frequent_travelers",
"weight": 43.618
},
{
"name": "interest_online_shoppers",
"weight": 36.203
},
{
"name": "travel",
"weight": 31.348
},
{
"name": "travel_holidays",
"weight": 22.629
},
{
"name": "interest_male",
"weight": 12.339
}
],
"uuid": "contact@hyscore.de",
"language": "en",
"url": "https://www.tripadvisor.com/Attraction_Review-g616016-d8038472-Reviews-Nungwi_Beach-Nungwi_Zanzibar_Island_Zanzibar_Archipelago.html",
"iab": [
{
"category": "category not supported by IAB"
},
{
"category": "category not supported by IAB"
},
{
"category": "Travel",
"code": "IAB20",
"weight": 31.348
},
{
"category": "Travel",
"code": "IAB20-1",
"weight": 22.629
},
{
"category": "category not supported by IAB"
}
],
"tld": "tripadvisor.com",
"metaKeywords": "Nungwi Beach, Nungwi, Zanzibar Island, Zanzibar Archipelago, attractions, things to do, attraction, opinions, fun, opening, map, reviews, information, guidebooks, advice, popular, ",
"text length": 2128
}
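
In the full response above, several "iab" entries carry only "category not supported by IAB" and have no code or weight. A minimal sketch for picking the most likely supported IAB entry could look like this. The helper name best_iab is hypothetical, and the sample list repeats a subset of the entries from the response above.

```python
def best_iab(iab_entries):
    """Return the supported IAB entry with the highest weight, or None."""
    # Entries without "code" and "weight" are the unsupported placeholders.
    supported = [e for e in iab_entries if "code" in e and "weight" in e]
    return max(supported, key=lambda e: e["weight"], default=None)


iab = [
    {"category": "category not supported by IAB"},
    {"category": "Travel", "code": "IAB20", "weight": 31.348},
    {"category": "Travel", "code": "IAB20-1", "weight": 22.629},
]
print(best_iab(iab))
```

In this example the weights of the supported "iab" entries equal the weights of the "categories" entries at the same positions, but the documentation does not state that this alignment is guaranteed.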

Last updated : 3rd July 2017