Sequence or sequence-to-sequence prediction with Ludwig? #17

Jari jarirajari · Jun'20

I have a file that contains a sequence of integer numbers as a column like this:

5491180
2223344
0152982
1234567
1894742

I am trying to do (LSTM) sequence or sequence-to-sequence prediction with Ludwig, but I cannot get the model specification right, which causes errors when I try to train my model. Can anyone advise what a minimal model spec (YAML) should look like? I looked for matching examples in the project, with no luck. Thank you!


Piero Molino w4nderlust · Jun'20

Please post what you want to achieve, what you tried, and what errors you got; otherwise it is difficult for anyone to help.

Jari jarirajari · Jul'20 Author

Thanks for the reply! To recap briefly: my data consists of numbers, and I am trying to predict the next number(s) based on the previous numbers (LSTM). The observed data looks like this:

In my data.csv file I created sequences (manually) by concatenating three consecutive observed numbers and using the next observed number in the data as the prediction. Each line repeats the same procedure, which results in the following file content:

numbers,predictions
0000001 0000002 0000003,0000004
0000002 0000003 0000004,0000005
0000003 0000004 0000005,0000006
0000004 0000005 0000006,0000007
0000005 0000006 0000007,0000008
0000006 0000007 0000008,0000009
0000007 0000008 0000009,0000010

My model definition mdf.yaml file content:

input_features:
    -
        name: numbers
        type: sequence
        encoder: cnnrnn
        cell_type: lstm

output_features:
    -
        name: predictions
        type: sequence
        decoder: generator

Running the command

ludwig train -mdf mdf.yaml --data_csv data.csv

results in an error:

--- snip ---
    raise ValueError("Cannot convert an unknown Dimension to a Tensor: %s" % d)
ValueError: Cannot convert an unknown Dimension to a Tensor: ?

I tried looking at the examples in the guide, but I clearly don't understand enough yet...

Piero Molino w4nderlust · Jul'20


Hi Jari,

a couple things here:

I should probably make this clear somewhere in the docs, but some sequence models, because of their architectures, work only on sequences longer than a certain length. The cnnrnn is one of those: its first CNN layers are there to compress a sequence of inputs, but in this case the sequence of inputs is already only 4 elements long, so the lack of padding in those layers compresses it too much. You can fix this by setting the parameters of the CNN layers accordingly (or reducing their filter size), but it's probably a better idea to use a different encoder that does not compress the sequence, for instance an RNN or a ParallelCNN.
You are treating your inputs and your outputs as sequences, meaning they are treated as sequences of discrete symbols. That's likely not what you actually want to do: you should treat the input as a timeseries and the output as a numerical feature.
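
As a rough, untested sketch of both points together, the model definition could look something like the following. It assumes the numbers and predictions columns from the data.csv above, swaps cnnrnn for a plain rnn encoder, and uses the timeseries and numerical feature type names as I'd expect them in the Ludwig version current at the time of this thread (other versions may name them differently):

```yaml
# Sketch only: feature names come from the data.csv header above;
# type names may differ in other Ludwig versions.
input_features:
    -
        name: numbers
        type: timeseries    # each cell holds space-separated numbers
        encoder: rnn        # does not compress the sequence like cnnrnn
        cell_type: lstm

output_features:
    -
        name: predictions
        type: numerical     # predict the next value as a number (regression)
```

As far as I know, a timeseries feature expects each cell to hold space-separated numbers, which is the layout the numbers column already has, so the same ludwig train command should work with this file.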

Let me know if this helps solve your problem.

Jari jarirajari · Jul'20 Author

Hi Piero, thanks for the help, I really appreciate it. I managed to move forward at least: as you pointed out, I didn't fully understand what I was doing. With the help of https://github.com/uber/ludwig/issues/124 as well, I finally made some progress, and got no errors.

I think for beginners like me it would help (to lessen the friction) to have:

a "ludwig clean" command that would clean the environment and make sure that a command run after it always leads to the same result.
example data in CSV format (for some cases). This would help beginners gain confidence that they are doing the right thing from the start, because it is actually quite easy to run into problems with data formatting, structure, and "dimensions".
maybe one built-in example in the Docker image (layer), meaning one example that a newbie could run and see results from in less than 10 minutes.

But Ludwig is still the only way of really getting into machine learning without extensive knowledge. What a great piece of work! The documentation has improved a lot, and the Docker image is really handy because installation and upgrades weren't always that easy.

Piero Molino w4nderlust · Jul'20

Hi Jari,

glad to hear things are working!
I welcome your suggestions, they are very valuable to me. I will add them as to-dos on this board that tracks how to improve the documentation: https://github.com/uber/ludwig/projects/2

To help me understand your suggestions better, could you please clarify the following points?

I'm not quite sure what you mean by "clean the environment". Also, commands should already lead to the same results because the random seed is set; if they don't, please let me know which command doesn't and I'll investigate why.
On the examples page on the website ( https://ludwig-ai.github.io/ludwig-docs/examples/ ) there are a few examples, some of which also have training data. I guess your suggestion is to provide data for all of them?
There are also a few example notebooks in the examples section of the repo: https://github.com/uber/ludwig/tree/master/examples The titanic one runs very quickly. But they are not well advertised, I guess... so I'll take it as a task to create a notebook for each example and link the notebooks on the examples page. How does that sound?