theano.scan: Non-unit value on shape on a broadcastable dimension
I am developing a simple program that uses the theano.scan function to loop through an array of vectors (in the end, I intend to develop an LSTM layer from this program). The problem is that the error "Non-unit value on shape of a broadcastable dimension" occurs when compiling the function. I believe the updates parameter causes it, because as long as I put it into the compiled function, the error occurs. Here is the code:
```python
import theano
import theano.tensor as T
from utils import *
import numpy as np

class LSTM:
    def __init__(self, X, in_size, out_size):
        self.X = X
        self.in_size = in_size
        self.out_size = out_size
        self.W_x = init_weights((in_size, out_size), "W_x")

        def _active(x, pre_h):
            x = T.reshape(x, (1, in_size))
            pre_h = T.dot(x, self.W_x)
            return pre_h

        H, updates = theano.scan(_active, sequences=X,
                                 outputs_info=[T.alloc(floatX(0.), 1, out_size)])
        self.activation = H

if __name__ == "__main__":
    X = T.matrix("X")
    in_size = 2
    out_size = 4

    lstm = LSTM(X, in_size, out_size)

    value = lstm.activation
    cost = T.mean(value)
    params = [lstm.W_x]
    updates = []
    for p in params:
        gp = T.grad(cost, p)
        updates.append((p, p - 0.1 * gp))

    f = theano.function([X], outputs=cost, updates=updates)
    test = f(np.random.rand(10, in_size))
    print test
```
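For reference, the per-step shapes inside `_active` can be checked with plain NumPy (a sketch assuming `in_size = 2` and `out_size = 4`, as in the script above). The leading size-1 dimension produced by the reshape, and by `T.alloc(floatX(0.), 1, out_size)` in `outputs_info`, is the kind of dimension Theano marks as broadcastable, which is where the error message points:

```python
import numpy as np

in_size, out_size = 2, 4           # same sizes as in the script

x = np.random.rand(in_size)        # one row of the scanned matrix X
W_x = np.random.randn(in_size, out_size) * 0.1

# _active reshapes the row to (1, in_size) before the dot product
x2 = x.reshape(1, in_size)         # shape (1, 2)
pre_h = x2.dot(W_x)                # shape (1, 4), matching outputs_info
```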
In this code, I use functions loaded from utils.py, which looks like this:
```python
# pylint: skip-file
import numpy as np
import theano
import theano.tensor as T

def floatX(x):
    return np.asarray(x, dtype=theano.config.floatX)

def init_weights(shape, name):
    return theano.shared(floatX(np.random.randn(*shape) * 0.1), name)

def init_gradws(shape, name):
    return theano.shared(floatX(np.zeros(shape)), name)

def init_bias(size, name):
    return theano.shared(floatX(np.zeros((size,))), name)
```
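As a quick sanity check, the arrays these helpers wrap can be reproduced with plain NumPy. This is a sketch that drops the `theano.shared` wrapper and hard-codes the dtype as `float32`, which is only an assumption about what `theano.config.floatX` is set to:

```python
import numpy as np

def floatX_np(x, dtype="float32"):   # stand-in for theano.config.floatX
    return np.asarray(x, dtype=dtype)

w = floatX_np(np.random.randn(2, 4) * 0.1)   # like init_weights((2, 4), "W_x")
b = floatX_np(np.zeros((4,)))                # like init_bias(4, "b")
```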
I have been searching for a while but have not found a solution to this problem, and I cannot see anything wrong with the code myself. If I do not use theano.scan, the code runs fine.
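To make the tensor that the scan returns concrete, here is a NumPy sketch of what stacking ten `(1, out_size)` steps produces. The imperative loop is my paraphrase of what scan computes here, not Theano's implementation; note the unit middle dimension in the result, which is the dimension the error message is about:

```python
import numpy as np

in_size, out_size = 2, 4
X = np.random.rand(10, in_size)
W_x = np.random.randn(in_size, out_size) * 0.1

steps = []
for x in X:                                       # scan iterates over rows of X
    steps.append(x.reshape(1, in_size).dot(W_x))  # each step is (1, out_size)

H = np.stack(steps)      # scan stacks the steps: shape (10, 1, out_size)
cost = H.mean()          # corresponds to T.mean(value) in the script
```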
Can you see the problem in this code? Do you have any advice on how to solve it?

Thanks in advance.