Incompatible StreamReceiver output by marc modules due to inconsistent leader handling #454

@TobiasNx

The documentation of encode-marc21 states that it is compatible with the output of handle-marc-xml and decode-marc21. This is not the case, because decode-marc21, handle-marc-xml, encode-marc21 and encode-marcxml handle the leader inconsistently.

For example, we cannot transform marc21 -> marcxml or the other way around. Even marc21 -> marc21 is not straightforward. See here. This produces the same error as when processing marc-xml.

Functional review: @TobiasNx
Code review: @blackwinter


Behaviour of Flux-Modules:

decode-marc21
splits the leader into separate fields, each named after the function of its leader position:
See here

---
leader:
  status: "p"
  type: "a"
  bibliographicLevel: "m"
  typeOfControl: " "
  characterCodingScheme: "a"
  encodingLevel: " "
  catalogingForm: "c"
  multipartLevel: " "
"001": "946638705"
"003": "DE-101"
"005": "20070429135622.0"
"007": "tu"
"008": "960123s2004    gw |||||r|||| 00||||eng  "
"015  ":
  a: "05,A03,2104"
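
The default output above can be reproduced by hand from a raw leader string. The following is a minimal sketch, not Metafacture code: the class `LeaderSplitSketch` is hypothetical, and the position table assumes that the field names correspond to the usual MARC 21 leader positions (05 to 09 and 17 to 19).

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LeaderSplitSketch {

    // Field names as they appear in the decode-marc21 output above.
    private static final String[] NAMES = {
        "status", "type", "bibliographicLevel", "typeOfControl",
        "characterCodingScheme", "encodingLevel", "catalogingForm", "multipartLevel"
    };

    // Assumed 0-based leader positions for the names above (MARC 21 leader layout).
    private static final int[] POSITIONS = {5, 6, 7, 8, 9, 17, 18, 19};

    private static final int LEADER_LENGTH = 24;

    // Splits a 24-character leader into one single-character field per name.
    static Map<String, String> split(final String leader) {
        if (leader.length() != LEADER_LENGTH) {
            throw new IllegalArgumentException("leader must have 24 characters");
        }
        final Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            fields.put(NAMES[i], String.valueOf(leader.charAt(POSITIONS[i])));
        }
        return fields;
    }
}
```

Calling `LeaderSplitSketch.split("02602pam a2200529 c 4500")` yields the eight values shown in the output above. The record length, base address and the remaining fixed positions are not part of the split output.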

With the option emitleaderaswhole="true", the leader is emitted as a single field named leader, nested inside a top-level element that is also named leader:
See here

---
leader:
  leader: "02602pam a2200529 c 4500"
"001": "946638705"
"003": "DE-101"
"005": "20070429135622.0"
"007": "tu"
"008": "960123s2004    gw |||||r|||| 00||||eng  "
"015  ":
  a: "05,A03,2104"
  z: "96,N47,0454"
  "2": "dnb"
"0167 ":

handle-marc-xml keeps the leader as a single top-level field:
See here:

---
type: "Bibliographic"
leader: "00000naa a2200000uc 4500"
"001": "1106253078"
"003": "DE-101"
"005": "20171202230117.0"
"007": "cr||||||||||||"
"008": "160712s2016    gw |||||o|||| 00||||eng  "
"0167 ":
  "2": "DE-101"
  a: "1106253078"
"022  ":

encode-marcxml can handle the output of decode-marc21(emitleaderaswhole="true"), but not the default output, in which the leader is split into multiple fields. In that case every leader field is written as its own leader element.

The result then looks like this:

	<marc:record>
		<marc:leader>p</marc:leader>
		<marc:leader>a</marc:leader>
		<marc:leader>m</marc:leader>
		<marc:leader> </marc:leader>
		<marc:leader>a</marc:leader>
		<marc:leader> </marc:leader>
		<marc:leader>c</marc:leader>
		<marc:leader> </marc:leader>

It seems there is no check that ensures only one leader element is written.
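
One way to end up with a single leader is to collect the individual leader fields first and assemble them into one 24-character string. The sketch below only illustrates that idea and is not taken from encode-marcxml. The class `LeaderJoinSketch` is hypothetical, the position table assumes the usual MARC 21 leader layout, and record length and base address are filled with zeros as in the handle-marc-xml output shown earlier.

```java
import java.util.Arrays;
import java.util.Map;

final class LeaderJoinSketch {

    // Assumed 0-based leader positions of the fields emitted by decode-marc21.
    private static final Map<String, Integer> POSITIONS = Map.of(
        "status", 5,
        "type", 6,
        "bibliographicLevel", 7,
        "typeOfControl", 8,
        "characterCodingScheme", 9,
        "encodingLevel", 17,
        "catalogingForm", 18,
        "multipartLevel", 19);

    // Assembles one leader string from single-character leader fields.
    static String join(final Map<String, String> fields) {
        final char[] leader = new char[24];
        Arrays.fill(leader, ' ');
        "00000".getChars(0, 5, leader, 0);   // record length, placeholder
        "22".getChars(0, 2, leader, 10);     // indicator and subfield code counts
        "00000".getChars(0, 5, leader, 12);  // base address, placeholder
        "4500".getChars(0, 4, leader, 20);   // entry map
        for (final Map.Entry<String, String> field : fields.entrySet()) {
            final Integer position = POSITIONS.get(field.getKey());
            if (position == null || field.getValue().length() != 1) {
                throw new IllegalArgumentException("unexpected leader field: " + field.getKey());
            }
            leader[position] = field.getValue().charAt(0);
        }
        return new String(leader);
    }
}
```

The joined string could then be written as the content of one marc:leader element instead of one element per field.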


encode-marc21 cannot handle the output of handle-marc-xml: see

Error is:

org.metafacture.framework.FormatException: invalid tag format for reference field
    at org.metafacture.biblio.iso2709.RecordBuilder.checkValidReferenceFieldTag (RecordBuilder.java:260)
        org.metafacture.biblio.iso2709.RecordBuilder.appendReferenceField (RecordBuilder.java:244)
        org.metafacture.biblio.iso2709.RecordBuilder.appendReferenceField (RecordBuilder.java:224)
        org.metafacture.biblio.marc21.Marc21Encoder.processTopLevelLiteral (Marc21Encoder.java:254)
        org.metafacture.biblio.marc21.Marc21Encoder.literal (Marc21Encoder.java:186)
        org.metafacture.biblio.marc21.MarcXmlHandler.endElement (MarcXmlHandler.java:135)

It also cannot handle the output of decode-marc21(emitleaderaswhole="true"): see

The error is:

org.metafacture.framework.FormatException: literal must only contain a single character:leader
    at org.metafacture.biblio.marc21.Marc21Encoder.processLiteralInLeader (Marc21Encoder.java:195)
        org.metafacture.biblio.marc21.Marc21Encoder.literal (Marc21Encoder.java:183)
        org.metafacture.biblio.marc21.Marc21Decoder.emitLeader (Marc21Decoder.java:254)
        org.metafacture.biblio.marc21.Marc21Decoder.process (Marc21Decoder.java:221)
        org.metafacture.biblio.marc21.Marc21Decoder.process (Marc21Decoder.java:136)
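
Both errors come down to the encoder expecting a different leader shape than it receives. As a rough illustration of tolerant handling, the sketch below accepts either a whole 24-character leader or single-character leader fields. It is not the actual Marc21Encoder logic. The class `TolerantLeaderSketch` and its methods are hypothetical, and the position table assumes the usual MARC 21 leader layout.

```java
import java.util.Arrays;
import java.util.Map;

final class TolerantLeaderSketch {

    // Assumed 0-based leader positions of the single-character leader fields.
    private static final Map<String, Integer> POSITIONS = Map.of(
        "status", 5,
        "type", 6,
        "bibliographicLevel", 7,
        "typeOfControl", 8,
        "characterCodingScheme", 9,
        "encodingLevel", 17,
        "catalogingForm", 18,
        "multipartLevel", 19);

    private final char[] leader = new char[24];

    TolerantLeaderSketch() {
        Arrays.fill(leader, ' ');
        "00000".getChars(0, 5, leader, 0);   // record length, placeholder
        "22".getChars(0, 2, leader, 10);
        "00000".getChars(0, 5, leader, 12);  // base address, placeholder
        "4500".getChars(0, 4, leader, 20);
    }

    // Accepts a whole leader ("leader" with 24 characters) or one named position.
    void literal(final String name, final String value) {
        if ("leader".equals(name) && value.length() == 24) {
            // Copy only the positions that the split form would carry.
            for (final int position : POSITIONS.values()) {
                leader[position] = value.charAt(position);
            }
        } else if (POSITIONS.containsKey(name) && value.length() == 1) {
            leader[POSITIONS.get(name)] = value.charAt(0);
        } else {
            throw new IllegalArgumentException("unsupported leader literal: " + name);
        }
    }

    String build() {
        return new String(leader);
    }
}
```

With this approach, the whole-leader form and the split form lead to the same leader string, so both decoder variants could feed the same encoder.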

So, besides the inconsistencies, it is difficult to transform marc21 -> marcxml or the other way around. Even marc21 -> marc21 is not straightforward. See here. This produces the same error as when processing marc-xml.
