Implemented where that have individual clause for each row. #1053

crakjie · 2016-11-28T16:52:45Z

Added a per-row clause to the join's where.
This allows this kind of syntax:

rdd.joinWithCassandraTable(ks, tableName).where("timestampMilis = ?", (k : KVRow) => Seq(k.timestampSecond * 1000))

Of course the syntax in the where can be enhanced, which is why it's open to review.
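To make the intent concrete, here is a minimal plain-Scala sketch of what a per-row where does conceptually: the clause text is shared, while the bound values are computed from each left-side row. It does not use the connector; the shape of `KVRow` and the names `BoundClause` and `bindPerRow` are assumptions made up for illustration.

```scala
// Hypothetical model, not connector code: one shared clause, values bound per row.
final case class KVRow(key: String, timestampSecond: Long)
final case class BoundClause(cql: String, values: Seq[Any])

object PerRowWhere {
  // Pair the shared clause text with the values derived from each input row.
  def bindPerRow[T](rows: Seq[T], cql: String, values: T => Seq[Any]): Seq[BoundClause] =
    rows.map(row => BoundClause(cql, values(row)))

  // Two sample rows bound against the clause from the example above.
  val demo: Seq[BoundClause] = bindPerRow(
    Seq(KVRow("a", 1L), KVRow("b", 2L)),
    "timestampMilis = ?",
    (k: KVRow) => Seq(k.timestampSecond * 1000)
  )
}
```

Each element of `demo` carries the same clause text but its own bound value, which is the behaviour the new where overload is meant to provide.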

datastax-bot · 2016-11-28T16:52:56Z

Hi @crakjie, thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign Spark Cassandra Connector CLA. It's all electronic and will take just minutes.

datastax-bot · 2016-12-06T17:12:24Z

Thank you @crakjie for signing the Spark Cassandra Connector CLA.

RussellSpitzer · 2016-12-21T01:03:09Z

Can you write a little bit about the use case for this API? I took a brief look today (sorry, so busy) and I think it's a very cool idea, but I'm having a hard time thinking about how someone would actually use it.

LukaszZu · 2016-12-22T09:14:57Z

Hmm, I think it would make it possible to express a join like the following standard SQL query:
select * from tb1 join tb2 on tb1.id = tb2.id where tb1.eventTime between tb2.from and tb2.to
With data like this:

tb1 (RDD)
id|eventTime|others
1|12:30|foo

tb2 (table in Cassandra)
id|from|to|asset name
1|11:10|11:40|Bar
1|11:41|15:30|What Im looking

As far as I know, joinWithCassandraTable currently returns both of these rows, but after this patch I would get only one.
That would be a very nice feature for me, but I'm not sure I understood the code correctly.
Please correct me if I'm wrong.
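The example above can be sketched in plain Scala, without Spark or Cassandra, to show the difference between joining on id alone and joining on id plus a per-row range condition. The case classes and the object name are hypothetical, and the filtering here happens in memory, whereas the patch would presumably push the condition into the where clause sent to Cassandra.

```scala
// Plain-Scala illustration of the range join; no Spark or Cassandra involved.
import java.time.LocalTime

final case class Event(id: Int, eventTime: LocalTime, others: String)         // tb1 (RDD side)
final case class Asset(id: Int, from: LocalTime, to: LocalTime, name: String) // tb2 (Cassandra side)

object RangeJoinSketch {
  val events: Seq[Event] = Seq(Event(1, LocalTime.parse("12:30"), "foo"))
  val assets: Seq[Asset] = Seq(
    Asset(1, LocalTime.parse("11:10"), LocalTime.parse("11:40"), "Bar"),
    Asset(1, LocalTime.parse("11:41"), LocalTime.parse("15:30"), "What Im looking")
  )

  // Join on id only: every asset row with a matching id comes back.
  def joinOnId: Seq[(Event, Asset)] =
    for (e <- events; a <- assets if a.id == e.id) yield (e, a)

  // Join on id plus a per-row range predicate (from <= eventTime <= to).
  def joinOnIdAndRange: Seq[(Event, Asset)] =
    joinOnId.filter { case (e, a) =>
      !e.eventTime.isBefore(a.from) && !e.eventTime.isAfter(a.to)
    }
}
```

With the sample data, `joinOnId` yields both asset rows while `joinOnIdAndRange` keeps only the row whose interval contains 12:30.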

crakjie · 2016-12-22T10:49:25Z

I had this idea because I have to do a join over a timestamp, but not on timestamp equality.

The database contained timestamps older than those in the left RDD, and each element of the left RDD carried the information about how old the rows it should be joined with had to be. So the join needed information contained in each left element.

So the general idea was to be able to modify the where clause depending on each input element.

I still don't know whether the type of the "fwhere" function is right or whether it can be simplified. Currently the function has to return an internal SCC object.

RussellSpitzer · 2016-12-22T20:04:04Z

I'm wondering if we might be better off with just another api, like a generic "RunPreparedStatements"

Which would be something like

RDD[BoundParameters].runPreparedStatements[ReturnType]("CQL HERE with ? PARAMETERS ?")

The joins would then become a child class of that?
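As a rough plain-Scala sketch of the shape such an API might take (purely hypothetical: neither `runPreparedStatements` nor the `execute` callback exists in the connector, and a real version would work on an RDD rather than a Seq):

```scala
// Hypothetical shape only: nothing here is part of the connector.
object RunPreparedSketch {
  type BoundParameters = Seq[Any]

  // Run one statement per element, binding that element's parameters,
  // and flatten the per-element results into a single sequence.
  def runPreparedStatements[R](params: Seq[BoundParameters], cql: String)(
      execute: (String, BoundParameters) => Seq[R]): Seq[R] =
    params.flatMap(p => execute(cql, p))
}
```

A join would then be one particular choice of CQL text and bound parameters, with the executor supplied by the connector.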

etspaceman · 2016-12-23T15:12:07Z

@RussellSpitzer +1 to that idea. This would be a really great addition, giving users a lot of flexibility in RDD processing.

crakjie · 2017-01-17T12:46:28Z

Back from holidays.
Why not, @RussellSpitzer, but how would the validity of the request be checked?

RussellSpitzer · 2017-01-17T19:08:59Z

I think we should pause on this and instead focus on making a completely flexible function, like I described above. That way we don't increase the complexity of the code as it is, and we are able to introduce a greater amount of flexibility.

Implemented where that have individual clause for each row.

33834e7

datastax-bot added the cla-missing label Nov 28, 2016

datastax-bot removed the cla-missing label Dec 6, 2016
