Re: Unicode problem
Posted by papaiking on Jul 13, 2011; 7:39am
URL: http://ngl.70.s1.nabble.com/Unicode-problem-tp6577693p6578051.html
I did that, but the result did not change.
I think the problem is at:
MarcReader reader = new MarcStreamReader(input);
MARC4J may be using ISO-8859-1 as its default encoding instead of UTF-8.
We need to specify UTF-8 when using MarcStreamReader.
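
To show what I mean, here is a small sketch using only the standard Java library (no MARC4J), so the class name EncodingDemo and the decode helper are just made up for illustration. It takes the start of the title from tag 245, encodes it as UTF-8, and decodes the same bytes once as UTF-8 and once as ISO-8859-1. As far as I know MarcStreamReader also has a constructor that takes an encoding name as a second argument, e.g. new MarcStreamReader(input, "UTF8"), but please check that against the MARC4J version bundled with NGL.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // Decode raw bytes with an explicit charset instead of relying on a default.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "Bàn về" written with Unicode escapes so the source file encoding does not matter.
        String title = "B\u00e0n v\u1ec1";
        byte[] utf8 = title.getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        // Same bytes, two different results depending on the charset used to read them.
        System.out.println("bytes in UTF-8:          " + utf8.length);
        System.out.println("chars read as UTF-8:     " + right.length());
        System.out.println("chars read as ISO-8859-1: " + wrong.length());
        System.out.println("round trip intact:       " + right.equals(title));
    }
}
```

Read with the wrong charset, every accented letter turns into two or three separate characters, so the text no longer matches what was stored. I suspect a mismatch like this is behind the broken 245 field below, but I have not confirmed it in the MARC4J source.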
Here is the console output on my server:
org.marc4j.MarcException: error parsing data field for tag: 245 with data: aBàn về t�
at org.marc4j.MarcStreamReader.next(MarcStreamReader.java:220)
at newgenlib.marccomponent.conversion.Converter.getMarcModelsFromMarc(Converter.java:469)
at org.verus.ngl.indexing.NewBibliographicSolrIndexCreator.indexingData(NewBibliographicSolrIndexCreator.java:113)
at eof.techProcessing.BuildIndexingPanel.buildIndex(BuildIndexingPanel.java:216)
at eof.techProcessing.BuildIndexingPanel$2.construct(BuildIndexingPanel.java:198)
at tools.SwingWorker$2.run(SwingWorker.java:119)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: subfield not terminated