Module:Citation/CS1/Identifiers: Difference between revisions

update per RfC;
m 1 revision imported
 
(2 intermediate revisions by 2 users not shown)
Line 250:
--[[--------------------------< N O R M A L I Z E _ L C C N >--------------------------------------------------
 
LCCN normalization (https://www.loc.gov/marc/lccn-namespace.html#normalization)
1. Remove all blanks.
2. If there is a forward slash (/) in the string, remove it, and remove all characters to the right of the forward slash.
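The two normalization steps visible in this hunk can be sketched as below. This is only an illustration: `lccn_strip` is a hypothetical name, not the module's function, and any later steps of the normalization procedure that fall outside this hunk are deliberately not attempted.

```lua
-- Hypothetical helper covering ONLY steps 1 and 2 shown in this hunk.
local function lccn_strip(lccn)
	lccn = lccn:gsub('%s+', '')    -- step 1: remove all blanks
	lccn = lccn:gsub('/.*$', '')   -- step 2: drop the first '/' and everything to its right
	return lccn
end

print(lccn_strip(' 75-425165//r75 '))
```

The single-target assignments keep only the first return value of `gsub` (the new string) and discard the substitution count.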
Line 287:
--[[--------------------------< A R X I V >--------------------------------------------------------------------
 
See: https://arxiv.org/help/arxiv_identifier
 
format and error check arXiv identifier. There are three valid forms of the identifier:
Line 381:
Validates (sort of) and formats a bibcode ID.
 
Format for bibcodes is specified here: https://adsabs.harvard.edu/abs_doc/help_pages/data.html#bibcodes
 
But, this: 2015arXiv151206696F is apparently valid so apparently, the only things that really matter are length, 19 characters
Line 527:
and terminal punctuation may not be technically correct, but it appears that in practice these characters are rarely
if ever used in DOI names.
 
https://www.doi.org/doi_handbook/2_Numbering.html -- 2.2 Syntax of a DOI name
https://www.doi.org/doi_handbook/2_Numbering.html#2.2.2 -- 2.2.2 DOI prefix
 
]]
Line 573 ⟶ 576:
'^[^1-9]%d%d%d$', -- 4 digits without subcode (0xxx); accepts: 1000–9999
'^%d%d%d%d%d%d+', -- 6 or more digits
'^%d%d?%d?$', -- less than 4 digits without subcode (3 digits with subcode is legitimate)
'^%d%d?%.[%d%.]+', -- 1 or 2 digits with subcode
'^5555$', -- test registrant will never resolve
'[^%d%.]', -- any character that isn't a digit or a dot
Line 621 ⟶ 625:
if ever used in HDLs.
 
Query string parameters are named here: https://www.handle.net/proxy_servlet.html. Query strings are not displayed
but since '?' is an allowed character in an HDL, '?' followed by one of the query parameters is the only way we
have to detect the query string so that it isn't URL-encoded with the rest of the identifier.
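The detection idea described above might look roughly like the following sketch. It assumes only the two parameter names visible in this diff (`noredirect`, `ignore_aliases`). The names `known_params` and `split_hdl_query` are hypothetical and are not taken from the module.

```lua
-- Hypothetical sketch: split an HDL at '?' only when a known query parameter follows.
local known_params = { 'noredirect', 'ignore_aliases' }  -- assumption: just the two names shown in the diff

local function split_hdl_query(id)
	for _, param in ipairs(known_params) do
		-- lazy '.-' stops at the first '?' that is followed by this parameter name
		local base, query = id:match('^(.-)(%?' .. param .. '.*)$')
		if base then
			return base, query   -- base can be URL-encoded; query is left untouched
		end
	end
	return id, nil               -- no known parameter: any '?' belongs to the identifier
end

print(split_hdl_query('2027/mdp.39015077587742?noredirect'))
```

A '?' that is not followed by a known parameter stays in the identifier, so it would be encoded with the rest of the handle.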
Line 631 ⟶ 635:
local access = options.access;
local handler = options.handler;
local query_params = { -- list of known query parameters from https://www.handle.net/proxy_servlet.html
'noredirect',
'ignore_aliases',
Line 800 ⟶ 804:
 
Determines whether an ISMN string is valid. Similar to ISBN-13, ISMN is 13 digits beginning 979-0-... and uses the
same check digit calculations. See https://www.ismn-international.org/download/Web_ISMN_Users_Manual_2008-6.pdf
section 2, pages 9–12.
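As a rough illustration of the check described above, the sketch below applies the ISBN-13 style weighting (alternating weights 1 and 3, total divisible by 10) to a 13-digit ISMN beginning 979-0. The function name `is_valid_ismn13` is hypothetical, and the code is not copied from the module.

```lua
-- Hypothetical sketch of a 13-digit ISMN check using the ISBN-13 weighting.
local function is_valid_ismn13(id)
	local digits = id:gsub('[%s%-]', '')                 -- strip spaces and hyphens
	if not digits:match('^9790%d%d%d%d%d%d%d%d%d$') then  -- 13 digits, prefix 979-0
		return false
	end
	local sum = 0
	for i = 1, 13 do
		local d = tonumber(digits:sub(i, i))
		sum = sum + d * ((i % 2 == 1) and 1 or 3)         -- weights 1,3,1,3,...
	end
	return sum % 10 == 0
end

print(is_valid_ismn13('979-0-2600-0043-8'))
```

For 979-0-2600-0043-8 the weighted sum is 80, which is divisible by 10, so the sketch accepts it.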
 
Line 849 ⟶ 853:
like this:
 
|issn=0819 4327 gives: [https://www.worldcat.org/issn/0819 4327 0819 4327] -- can't have spaces in an external link
This code now prevents that by inserting a hyphen at the ISSN midpoint. It also validates the ISSN for length
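A minimal sketch of the midpoint-hyphen idea follows. It assumes an ISSN is eight characters, the last of which may be a digit or X. `hyphenate_issn` is a hypothetical name, and the module's own validation may well do more than this length and character check.

```lua
-- Hypothetical sketch: normalize an ISSN so it is safe inside an external link.
local function hyphenate_issn(id)
	local compact = id:gsub('[%s%-]', '')                 -- drop spaces and any existing hyphen
	if not compact:match('^%d%d%d%d%d%d%d[%dXx]$') then    -- 7 digits plus a digit or X
		return nil                                          -- wrong length or characters
	end
	return compact:sub(1, 4) .. '-' .. compact:sub(5, 8)   -- hyphen at the midpoint
end

print(hyphenate_issn('0819 4327'))
```

With this, `|issn=0819 4327` would produce a link target of `0819-4327` with no space in it.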
Line 953 ⟶ 957:
Format LCCN link and do simple error checking. LCCN is a character string 8-12 characters long. The length of
the LCCN dictates the character type of the first 1-3 characters; the rightmost eight are always digits.
https://oclc-research.github.io/infoURI-Frozen/info-uri.info/info:lccn/reg.html
http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:lccn/
 
length = 8 then all digits