Invalid XML character during unmarshalling

Problem description

I am marshalling objects to an XML file using the encoding "UTF-8". The file is generated successfully, but when I try to unmarshal it, I get this error:

An invalid XML character (Unicode: 0x{2}) was found in the value of attribute "{1}" and element is "0"

The character is 0x1A (U+001A), which is valid in UTF-8 but illegal in XML. The Marshaller in JAXB allows this character to be written into the XML file, but the Unmarshaller cannot parse it back. I tried other encodings (UTF-16, ASCII, etc.) but still got the error.

The common solution is to remove or replace the invalid character before XML parsing. But if we need this character back, how can we get the original character after unmarshalling?
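One possible way to keep the original character is to make the replacement reversible: escape illegal characters into plain text before marshalling and decode them after unmarshalling. The sketch below is only an illustration under assumptions; the class name XmlCharEscaper, the backslash marker, and the escape format are all hypothetical and not part of JAXB.

```java
// Hypothetical helper: reversibly encodes characters that XML 1.0 forbids.
final class XmlCharEscaper {
    // Assumed escape marker; it is doubled when it occurs in the input itself.
    private static final char MARKER = '\\';

    private XmlCharEscaper() {}

    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Replaces each illegal character with MARKER + 'u' + four hex digits. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == MARKER) {
                out.append(MARKER).append(MARKER);
            } else if (isXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                // Every illegal code point is below 0x10000, so four digits suffice.
                out.append(MARKER).append('u').append(String.format("%04X", cp));
            }
        }
        return out.toString();
    }

    /** Reverses escape(); throws IllegalArgumentException on malformed input. */
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != MARKER) {
                out.append(c);
                continue;
            }
            if (i + 1 >= s.length()) {
                throw new IllegalArgumentException("dangling escape at " + i);
            }
            char next = s.charAt(i + 1);
            if (next == MARKER) {
                out.append(MARKER);
                i += 1; // skip the second marker
            } else if (next == 'u' && i + 6 <= s.length()) {
                out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 5; // skip 'u' and the four hex digits
            } else {
                throw new IllegalArgumentException("bad escape at " + i);
            }
        }
        return out.toString();
    }
}
```

In a JAXB model, calls like these could be placed in an XmlAdapter for the affected String fields, so that escape runs on marshal and unescape on unmarshal. Whether that fits depends on who else reads the XML file, since other consumers would see the escaped text.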

While looking for a solution, I decided to replace the invalid characters with a substitute character (for example a dot, ".") before unmarshalling.

I have created this class:

import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

public class InvalidXmlCharacterFilterReader extends FilterReader {
    public static final char substitute = '.';

    public InvalidXmlCharacterFilterReader(Reader in) {
        super(in);
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int read = super.read(cbuf, off, len);
        if (read == -1)
            return -1;
        // Replace every invalid char in the chunk that was just read.
        for (int readPos = off; readPos < off + read; readPos++) {
            if (!isValid(cbuf[readPos])) {
                cbuf[readPos] = substitute;
            }
        }
        // Report the number of chars read; readPos is out of scope here.
        return read;
    }

    public boolean isValid(char c) {
        // A Java char is a 16-bit UTF-16 unit, so it can never reach 0x10000.
        // Surrogates are passed through so supplementary characters stay intact.
        return (c == 0x9)
                || (c == 0xA)
                || (c == 0xD)
                || ((c >= 0x20) && (c <= 0xD7FF))
                || Character.isSurrogate(c)
                || ((c >= 0xE000) && (c <= 0xFFFD));
    }
}

This is how I read and unmarshal the file:

FileReader fileReader = new FileReader(this.getFile());
Reader reader = new InvalidXmlCharacterFilterReader(fileReader);
Object o = (Object) um.unmarshal(reader);

Somehow the reader does not replace the invalid characters with the character I want. The result is malformed XML data that cannot be unmarshalled. Is there something wrong with my InvalidXmlCharacterFilterReader class?
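One way to narrow this down is to run a filtering reader against an in-memory string instead of the real file. The following is a minimal sketch, not a confirmed fix; SubstitutingReader and its filter helper are hypothetical names used only for this check. It also overrides the single-character read(), because FilterReader.read() delegates straight to the wrapped reader and would otherwise bypass the substitution.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-alone variant of the filter, used only to test the idea.
final class SubstitutingReader extends FilterReader {
    static final char SUBSTITUTE = '.';

    SubstitutingReader(Reader in) {
        super(in);
    }

    /** XML 1.0 check per UTF-16 unit; surrogates pass through untouched. */
    static boolean isAllowed(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || Character.isSurrogate(c)
                || (c >= 0xE000 && c <= 0xFFFD);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return (c == -1 || isAllowed((char) c)) ? c : SUBSTITUTE;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = super.read(cbuf, off, len);
        // The loop body never runs when n is -1 (end of stream).
        for (int i = off; i < off + n; i++) {
            if (!isAllowed(cbuf[i])) {
                cbuf[i] = SUBSTITUTE;
            }
        }
        return n; // number of chars actually read
    }

    /** Test helper: pushes a whole string through the reader in small chunks. */
    static String filter(String xml) {
        StringBuilder out = new StringBuilder();
        try (Reader r = new SubstitutingReader(new StringReader(xml))) {
            char[] buf = new char[4];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

If a check like this passes while unmarshalling still fails, it may be worth inspecting the generated file itself. A reader of this kind only sees raw characters, so if the marshaller wrote the character as a numeric character reference such as &#26; there is no raw 0x1A in the stream to replace. Note also that FileReader decodes with the platform default charset unless one is given, which may differ from the UTF-8 used when the file was written.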


Recommended answer