Re: Unicode problem

Posted by papaiking on Jul 13, 2011; 7:39am
URL: http://ngl.70.s1.nabble.com/Unicode-problem-tp6577693p6578051.html

I tried that, but the result did not change.
I think the problem is at this line:
MarcReader reader = new MarcStreamReader(input);

MARC4J may be using ISO-8859-1 as its default encoding instead of UTF-8.
If so, we need to specify UTF-8 explicitly when creating the MarcStreamReader.
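
To show what I mean, here is a small sketch using only the JDK (no MARC4J involved). It takes the start of the title from the failing 245 field, encodes it as UTF-8, and decodes the bytes once as ISO-8859-1 and once as UTF-8. The class name EncodingDemo and the decode helper are made up for illustration. As far as I know, MarcStreamReader also has a constructor that takes an encoding name as a second argument, but please check that against the MARC4J version in use.

```java
import java.nio.charset.Charset;

public class EncodingDemo {

    // Hypothetical helper: decode raw bytes using the given charset,
    // the way a reader with a fixed default encoding would.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        Charset utf8 = Charset.forName("UTF-8");
        Charset latin1 = Charset.forName("ISO-8859-1");

        // Vietnamese title fragment, written with Unicode escapes so the
        // source file encoding does not matter.
        String title = "B\u00e0n v\u1ec1";
        byte[] raw = title.getBytes(utf8);

        String wrong = decode(raw, latin1); // one char per byte, text is mangled
        String right = decode(raw, utf8);   // multi-byte sequences are rebuilt

        System.out.println("ISO-8859-1 matches original: " + wrong.equals(title));
        System.out.println("UTF-8 matches original: " + right.equals(title));

        // Assumption to verify: with MARC4J the equivalent fix would be
        // something like new MarcStreamReader(input, "UTF8").
    }
}
```

If the reader really decodes with ISO-8859-1, every multi-byte UTF-8 character in the record is split into several characters, which would match the broken title in the error.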
Here is the MARC4J exception from my server console:

org.marc4j.MarcException: error parsing data field for tag: 245 with data:   aBàn về t�
        at org.marc4j.MarcStreamReader.next(MarcStreamReader.java:220)
        at newgenlib.marccomponent.conversion.Converter.getMarcModelsFromMarc(Converter.java:469)
        at org.verus.ngl.indexing.NewBibliographicSolrIndexCreator.indexingData(NewBibliographicSolrIndexCreator.java:113)
        at eof.techProcessing.BuildIndexingPanel.buildIndex(BuildIndexingPanel.java:216)
        at eof.techProcessing.BuildIndexingPanel$2.construct(BuildIndexingPanel.java:198)
        at tools.SwingWorker$2.run(SwingWorker.java:119)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: subfield not terminated