Garbled Chinese Text in Graphviz Images - HelloWorld Developer Community

1. Error details

Symptom: the graph rendered by graph.view() shows Chinese text as garbled characters.

In [40]:

from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

# Train an entropy-based decision tree on the wine dataset.
wine = load_wine()
Xtrain, Xtest, Ytrain, Ytest = train_test_split(wine.data, wine.target, test_size=0.3)
clf = tree.DecisionTreeClassifier(criterion="entropy")
clf = clf.fit(Xtrain, Ytrain)
score = clf.score(Xtest, Ytest)

# Chinese feature names; these are the labels that end up garbled in the rendered graph.
feature_name = ['酒精','苹果酸','灰','灰的碱性','镁','总酚','类黄酮','非黄烷类酚类','花青素','颜色强度','色调','od280/od315稀释葡萄酒','脯氨酸']

In [41]:

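import graphviz

# With out_file=None, export_graphviz returns the DOT source as a string.
dot_data = tree.export_graphviz(
    clf,
    out_file=None,
    feature_names=feature_name,
    class_names=["琴酒", "雪莉", "贝尔摩德"],
    filled=True,
    rounded=True,
)
graph = graphviz.Source(dot_data)
graph.view()  # the Chinese labels are garbled in the rendered PDF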

Out[41]:

'Source.gv.pdf'

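2. How the fix works

Handle the DOT source as UTF-8 and replace the font with FangSong (仿宋). The DOT source generated by export_graphviz names helvetica as the font, which typically has no Chinese glyphs, so the labels need a font that does. FangSong must be installed on the system; any other installed font with Chinese glyphs can be substituted in the same way.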
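The font swap can be illustrated without scikit-learn or Graphviz installed. The following is a minimal sketch, not the article's own code: `set_dot_font` is a hypothetical helper, and `sample` is a hand-written DOT fragment (including a made-up threshold) that only imitates the shape of export_graphviz output. It rewrites `fontname` attributes only, which is slightly stricter than a blind string replace.

```python
import re


def set_dot_font(dot_source: str, font: str = "FangSong") -> str:
    """Point every fontname attribute in a DOT string at `font` (hypothetical helper)."""
    # Matches both fontname="helvetica" (quoted) and fontname=helvetica (bare).
    pattern = re.compile(r'fontname\s*=\s*("[^"]*"|[^\s,\]";]+)')
    return pattern.sub(lambda _m: f'fontname="{font}"', dot_source)


# Hand-written sample, not real export_graphviz output.
sample = (
    'digraph Tree {\n'
    'node [shape=box, style="filled, rounded", fontname="helvetica"] ;\n'
    'edge [fontname=helvetica] ;\n'
    '0 [label="酒精 <= 12.5"] ;\n'
    '}'
)

patched = set_dot_font(sample)
# The bytes that would end up in a UTF-8 encoded .dot file.
data = patched.encode("utf-8")
print(patched)
```

Only the font attributes change; the Chinese label text is left as it was and survives the UTF-8 round trip.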

3. Solution

(1) Method 1: write the DOT source to a file, read it back as UTF-8, and replace the font.

In [42]:

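import graphviz

# With a file path as out_file, export_graphviz writes tree.dot and returns None.
tree.export_graphviz(
    clf,
    out_file="tree.dot",
    feature_names=feature_name,
    class_names=["琴酒", "雪莉", "贝尔摩德"],
    filled=True,
    rounded=True,
)
# Read the file back as UTF-8, then swap helvetica for a font with Chinese glyphs.
with open("tree.dot", encoding="utf-8") as f:
    dot_graph = f.read()
graph = graphviz.Source(dot_graph.replace("helvetica", "FangSong"))
graph.view()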

Out[42]:

'Source.gv.pdf'
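The explicit encoding='utf-8' in the open() call above matters because, without it, Python decodes the file with the locale's preferred encoding, which on a Chinese-locale Windows system is often GBK. The sketch below shows the effect under the assumption that the .dot file was written as UTF-8; the DOT text and its threshold are hand-written sample data, and the temporary tree.dot path is only illustrative.

```python
import os
import tempfile

# Hand-written sample DOT text with a Chinese label.
dot_text = 'digraph Tree {\n0 [label="酒精 <= 12.5", fontname="FangSong"] ;\n}\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tree.dot")  # illustrative file name

    # Write the DOT text as UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        f.write(dot_text)

    # Matching encoding: the text round-trips unchanged.
    with open(path, encoding="utf-8") as f:
        good = f.read()

    # Mismatched encoding (GBK here): the same bytes decode to other characters.
    with open(path, encoding="gbk", errors="replace") as f:
        bad = f.read()

print(good == dot_text)  # True
print(bad == dot_text)   # False
```

Reading with the wrong codec does not necessarily raise an error; it can silently produce different characters, which then show up as garbled labels in the rendered graph.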

(2) Method 2: replace the font directly in the DOT string, without an intermediate file.

In [43]:

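import graphviz

dot_data = tree.export_graphviz(
    clf,
    out_file=None,
    feature_names=feature_name,
    class_names=["琴酒", "雪莉", "贝尔摩德"],
    filled=True,
    rounded=True,
)
# graphviz.Source expects a str, so pass the text itself rather than encoded bytes;
# its encoding argument (UTF-8 by default) controls how the source file is written.
graph = graphviz.Source(dot_data.replace("helvetica", "FangSong"), encoding="utf-8")
graph.view()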

Out[43]: