Deeplearning4j LSTM output size

My situation: as input I have a List<List<Float>> (a list of word-representation vectors), and as output a single Double per sequence.

So I build the following structure (first index: example number; second: word number within the sentence; third: word-vector element number): http://pastebin.com/KGdjwnki

And for the output: http://pastebin.com/fY8zrxEL

But when I feed one of these ( http://pastebin.com/wvFFC4Hw ) into model.output, I get back the vector [0.25, 0.24, 0.25, 0.25] instead of a single value.

What could be wrong? The code (in Kotlin) is attached below. classCount is one.

import org.deeplearning4j.nn.multilayer.MultiLayerNetwork
import org.deeplearning4j.nn.conf.NeuralNetConfiguration.Builder
import org.deeplearning4j.nn.api.OptimizationAlgorithm
import org.deeplearning4j.nn.conf.Updater
import org.deeplearning4j.nn.weights.WeightInit
import org.deeplearning4j.nn.conf.layers.GravesLSTM
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer
import org.deeplearning4j.nn.conf.BackpropType
import org.nd4j.linalg.api.ndarray.INDArray
import org.nd4j.linalg.factory.Nd4j
import org.nd4j.linalg.lossfunctions.LossFunctions
import java.util.*

class ClassifierNetwork(wordVectorSize: Int, classCount: Int) {

    data class Dimension(val x: Array<Int>, val y: Array<Int>)

    val model: MultiLayerNetwork
    val optimization = OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT
    val iterations = 1
    val learningRate = 0.1
    val rmsDecay = 0.95
    val seed = 12345
    val l2 = 0.001
    val weightInit = WeightInit.XAVIER
    val updater = Updater.RMSPROP
    val backpropType = BackpropType.TruncatedBPTT
    val tbpttLength = 50
    val epochs = 50

    var dimensions = Dimension(intArrayOf(0).toTypedArray(), intArrayOf(0).toTypedArray())

    init {
        // Three stacked LSTM layers followed by a softmax RNN output layer.
        val baseConfiguration = Builder().optimizationAlgo(optimization)
            .iterations(iterations).learningRate(learningRate).rmsDecay(rmsDecay)
            .seed(seed).regularization(true).l2(l2)
            .weightInit(weightInit).updater(updater)
            .list()
        baseConfiguration.layer(0, GravesLSTM.Builder().nIn(wordVectorSize).nOut(64).activation("tanh").build())
        baseConfiguration.layer(1, GravesLSTM.Builder().nIn(64).nOut(32).activation("tanh").build())
        baseConfiguration.layer(2, GravesLSTM.Builder().nIn(32).nOut(16).activation("tanh").build())
        baseConfiguration.layer(3, RnnOutputLayer.Builder()
            .lossFunction(LossFunctions.LossFunction.MCXENT)
            .activation("softmax").weightInit(WeightInit.XAVIER)
            .nIn(16).nOut(classCount).build())

        val cfg = baseConfiguration.build()!!
        cfg.backpropType = backpropType
        cfg.tbpttBackLength = tbpttLength
        cfg.tbpttFwdLength = tbpttLength
        cfg.isPretrain = false
        cfg.isBackprop = true
        model = MultiLayerNetwork(cfg)
    }

    // x: [exampleCount][sentenceLength][wordVectorLength], y: [exampleCount][classCount]
    private fun dataDimensions(x: List<List<Array<Double>>>, y: List<Array<Double>>): Dimension {
        assert(x.size == y.size)
        val exampleCount = x.size
        assert(x.size > 0)
        val sentenceLength = x[0].size
        assert(sentenceLength > 0)
        val wordVectorLength = x[0][0].size
        assert(wordVectorLength > 0)
        val classCount = y[0].size
        assert(classCount > 0)
        return Dimension(
            intArrayOf(exampleCount, wordVectorLength, sentenceLength).toTypedArray(),
            intArrayOf(exampleCount, classCount).toTypedArray()
        )
    }

    data class Fits(val x: INDArray, val y: INDArray)

    // Convert the nested lists into the INDArrays that model.fit() expects.
    private fun fitConversion(x: List<List<Array<Double>>>, y: List<Array<Double>>): Fits {
        val dim = dataDimensions(x, y)
        val xItems = ArrayList<INDArray>()
        for (i in 0..dim.x[0] - 1) {
            val itemList = ArrayList<DoubleArray>()
            for (j in 0..dim.x[1] - 1) {
                val rowList = ArrayList<Double>()
                for (k in 0..dim.x[2] - 1) {
                    rowList.add(x[i][k][j])
                }
                itemList.add(rowList.toTypedArray().toDoubleArray())
            }
            xItems.add(Nd4j.create(itemList.toTypedArray()))
        }
        val xFits = Nd4j.create(xItems, dim.x.toIntArray(), 'c')
        val yItems = ArrayList<DoubleArray>()
        for (i in 0..y.size - 1) {
            yItems.add(y[i].toDoubleArray())
        }
        val yFits = Nd4j.create(yItems.toTypedArray())
        return Fits(xFits, yFits)
    }

    // Root of the summed squared per-example Euclidean distances between labels and predictions.
    private fun error(epoch: Int, x: List<List<Array<Double>>>, y: List<Array<Double>>) {
        var totalDiff = 0.0
        for (i in 0..x.size - 1) {
            val source = x[i]
            val result = y[i]
            val realResult = predict(source)
            var diff = 0.0
            for (j in 0..result.size - 1) {
                val elementDiff = result[j] - realResult[j]
                diff += Math.pow(elementDiff, 2.0)
            }
            diff = Math.sqrt(diff)
            totalDiff += Math.pow(diff, 2.0)
        }
        totalDiff = Math.sqrt(totalDiff)
        print("Epoch ")
        print(epoch)
        print(", diff ")
        println(totalDiff)
    }

    fun train(x: List<List<Array<Double>>>, y: List<Array<Double>>) {
        dimensions = dataDimensions(x, y)
        val (xFit, yFit) = fitConversion(x, y)
        for (i in 0..epochs - 1) {
            model.input = xFit
            model.labels = yFit
            model.fit()
            error(i + 1, x, y)
        }
    }

    fun predict(x: List<Array<Double>>): Array<Double> {
        val xList = ArrayList<DoubleArray>()
        for (i in 0..dimensions.x[1] - 1) {
            val row = ArrayList<Double>()
            for (j in 0..dimensions.x[2] - 1) {
                row.add(x[j][i])
            }
            xList.add(row.toDoubleArray())
        }
        val xItem = Nd4j.create(xList.toTypedArray())
        val y = model.output(xItem)
        // NOTE: as pasted, `result` is never filled from `y`, so this always returns an empty array.
        val result = ArrayList<Double>()
        return result.toTypedArray()
    }
}
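For completeness, here is a minimal usage sketch of the class above. The sizes and data are made-up placeholders (wordVectorSize = 100 and 5-word sentences are assumptions, not values from the question):

import java.util.Random

fun main(args: Array<String>) {
    val wordVectorSize = 100               // assumption: length of each word vector
    val sentenceLength = 5                 // assumption: words per sentence
    val network = ClassifierNetwork(wordVectorSize, classCount = 1)

    // Two toy examples, shaped as List<List<Array<Double>>> the way train() expects.
    val rnd = Random(42)
    val x = List(2) { List(sentenceLength) { Array(wordVectorSize) { rnd.nextDouble() } } }
    // One Double label per example.
    val y = List(2) { arrayOf(rnd.nextDouble()) }

    network.train(x, y)
    // With the paste as-is this prints an empty list, since predict() never fills `result`.
    println(network.predict(x[0]).toList())
}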

UPD. It seems the following example tackles a "close" task, so I will check it later and post the solution: https://github.com/deeplearning4j/dl4j-0.4-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/recurrent/word2vecsentiment/Word2VecSentimentRNN.java

LSTM input/output can only be rank 3: see http://deeplearning4j.org/usingrnns
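That is, per the usingrnns page, the features have to arrive as a single rank-3 INDArray of shape [miniBatchSize, inputSize, timeSeriesLength] rather than as a list of 2D matrices. A minimal sketch of building such an array with Nd4j (all sizes here are made-up placeholders):

import org.nd4j.linalg.factory.Nd4j

fun main(args: Array<String>) {
    val miniBatchSize = 2    // number of examples
    val inputSize = 100      // word-vector length (the nIn of the first LSTM layer)
    val timeSteps = 5        // words per sentence

    // Rank-3 features of shape [miniBatchSize, inputSize, timeSeriesLength], zero-initialized.
    val features = Nd4j.create(miniBatchSize, inputSize, timeSteps)

    // Vector element f of word t of example e goes at index [e, f, t]:
    features.putScalar(intArrayOf(0, 3, 2), 0.17)

    println(features.shape().joinToString())
}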

Besides the suggestion to post this on the very active Gitter, and Adam's hint to check out the great documentation that explains how input and output have to be rank 3, I would like to point out a couple of other things in your code, since I was struggling with similar problems:

  • Have a look at the basic example in examples/recurrent/basic/BasicRNNExample.java: there you can see that with an RNN you don't use model.output(xItem), but model.rnnTimeStep(xItem);
  • Since classCount is one, you actually seem to be performing a regression. Because of this, also check the regression example in examples/feedforward/regression/RegressionSum.java and the documentation; there you see that as the activation function you should use "identity" (see the sketch after this list). "softmax" actually normalizes the output so that it sums to one (see the glossary), so if you have only one output it will always output 1 (at least it did for my problem).
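To illustrate that second point, this is what the output layer from the question's code could look like when rewritten for regression, with "identity" activation and MSE loss in place of "softmax"/MCXENT. It is a sketch following the RegressionSum example's pattern, not a verified drop-in fix:

import org.deeplearning4j.nn.conf.layers.RnnOutputLayer
import org.deeplearning4j.nn.weights.WeightInit
import org.nd4j.linalg.lossfunctions.LossFunctions

// Regression head: a single linear output, trained with mean squared error.
val regressionOutput = RnnOutputLayer.Builder()
    .lossFunction(LossFunctions.LossFunction.MSE)  // MSE instead of MCXENT
    .activation("identity")                        // linear output instead of softmax
    .weightInit(WeightInit.XAVIER)
    .nIn(16)                                       // nOut of the last LSTM layer in the question
    .nOut(1)                                       // classCount is one
    .build()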