[mongodb-dev] Database with query set as case study

Discussion:

Monika Shah

2017-12-15 03:17:17 UTC

I would like to request a database and an associated rich query set that
covers a large range of complex queries.
I need it to understand the query processing and optimization applied in a
variety of cases.

--
You received this message because you are subscribed to the Google Groups "mongodb-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-dev+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-dev.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-dev/7e7578ee-2a02-4fb9-b754-e75ed9cdce75%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

'Kevin Adistambha' via mongodb-dev

2018-01-05 06:08:42 UTC

Hi Monika

Could you elaborate on a more specific goal, e.g. what you require and
what you want to understand?

Best regards
Kevin


Monika Shah

2018-01-09 15:39:32 UTC

My aim is to understand the resource consumption of the query plan selected
by MongoDB's query optimizer.
For that,

- I want to understand which scan, index, join, map-reduce, or pipeline
MongoDB's query optimization/processing uses for which query.
- I also want to understand the impact of replication/sharding architecture,
database size, and workload on the cluster nodes.

For this, I would like to use a real case study, i.e. a dataset and a query
set, with the MongoDB query used for each query in the set.

This would be a great help.


'Kevin Adistambha' via mongodb-dev

2018-01-16 23:17:18 UTC

Hi Monika

I don't believe there is a canonical dataset for use as a case study, since
a lot would depend on the specific use case. Typically, it's best to
create some example data yourself (using something like mgeneratejs
<https://github.com/rueckstiess/mgeneratejs>) and examine how the query
planner reacts to your queries using db.collection.explain()
<https://docs.mongodb.com/manual/reference/method/cursor.explain/>. The
page Explain Results
<https://docs.mongodb.com/manual/reference/explain-results/> describes how
to read the explain() output.
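
As a rough illustration of reading explain() output, the sketch below walks
the winning plan of an explain document and collects the numbers usually
checked first. It is a minimal sketch under assumptions: the `sample`
document is hand-written for illustration rather than captured from a
server, the helper names `plan_stages` and `summarize` are hypothetical, and
the field names follow the Explain Results page linked above. With a live
server, a comparable document would come from running a query with
explain("executionStats").

```python
def plan_stages(plan):
    """Return stage names from the root of a plan tree down to its leaves."""
    stages = [plan["stage"]]
    # A stage has either a single child (inputStage) or several (inputStages).
    children = list(plan.get("inputStages", []))
    if "inputStage" in plan:
        children.insert(0, plan["inputStage"])
    for child in children:
        stages.extend(plan_stages(child))
    return stages


def summarize(explain_doc):
    """Condense an explain document into the values worth checking first."""
    stats = explain_doc.get("executionStats", {})
    return {
        "stages": plan_stages(explain_doc["queryPlanner"]["winningPlan"]),
        "returned": stats.get("nReturned"),
        "keys_examined": stats.get("totalKeysExamined"),
        "docs_examined": stats.get("totalDocsExamined"),
    }


# Hand-written sample (an assumption, not real server output): an indexed lookup.
sample = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {"stage": "IXSCAN", "indexName": "city_1"},
        }
    },
    "executionStats": {
        "nReturned": 12,
        "totalKeysExamined": 12,
        "totalDocsExamined": 12,
    },
}

summary = summarize(sample)
print(summary["stages"])
```

A COLLSCAN stage, or a docs_examined value far above returned, usually
suggests that an index is missing or not selective for that query.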

> I want to understand for which query, MongoDB Query optimization/processing
> use which scan, index, join, map-reduce, pipeline.

If you're looking to understand the query planner, there are some resources
that might be useful to you:

- Some examples of query optimization can be found in
https://docs.mongodb.com/manual/tutorial/analyze-query-plan/
- A high level description of the query planner can be found in
https://docs.mongodb.com/manual/core/query-plans/
- The source code for the query planner can be found under
https://github.com/mongodb/mongo/tree/master/src/mongo/db/query

> I also want to understand impact of replication/sharding architecture,
> database size, workload over cluster node .

The method db.collection.explain()
<https://docs.mongodb.com/manual/reference/method/cursor.explain/> also
works within a replica set and a sharded cluster. Given the same query, there
should be no difference between the plan for a standalone node and for a
replica set (since a replica set is supposed to provide you with high
availability). There will be differences between the query plan for a sharded
cluster and for a standalone/replica set, because there is one additional
layer (the mongos) between the client and the data (the shard servers).
Notably, the plan could be quite different depending on your shard key.
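
To make the sharded case more concrete, the sketch below pulls the
per-shard winning plans out of an explain document as returned through a
mongos. It rests on assumptions: the `scatter` document is hand-written for
illustration (the shard names are made up), the helper names `shard_plans`
and `is_targeted` are hypothetical, and the stage names SINGLE_SHARD,
SHARD_MERGE and SHARDING_FILTER are taken from the Explain Results page.

```python
def shard_plans(explain_doc):
    """Map each shard name to the stage names of its winning plan.

    A standalone or replica set explain document has no "shards" list, so
    its single plan is returned under the key "unsharded".
    """
    def stages(plan):
        names = [plan["stage"]]
        child = plan.get("inputStage")
        while child is not None:  # follow the single-child chain
            names.append(child["stage"])
            child = child.get("inputStage")
        return names

    winning = explain_doc["queryPlanner"]["winningPlan"]
    if "shards" not in winning:
        return {"unsharded": stages(winning)}
    return {s["shardName"]: stages(s["winningPlan"]) for s in winning["shards"]}


def is_targeted(explain_doc):
    """True when the mongos routed the query to exactly one shard."""
    return explain_doc["queryPlanner"]["winningPlan"].get("stage") == "SINGLE_SHARD"


# Hand-written sample (an assumption): a query that does not include the
# shard key, so the mongos has to ask every shard and merge the results.
scatter = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "SHARD_MERGE",
            "shards": [
                {"shardName": "shardA",
                 "winningPlan": {"stage": "SHARDING_FILTER",
                                 "inputStage": {"stage": "COLLSCAN"}}},
                {"shardName": "shardB",
                 "winningPlan": {"stage": "SHARDING_FILTER",
                                 "inputStage": {"stage": "COLLSCAN"}}},
            ],
        }
    }
}

plans = shard_plans(scatter)
print(plans, is_targeted(scatter))
```

Comparing the output for a query that includes the shard key with one that
does not is one way to see how much the shard key affects the plan.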

Another source of information is running MongoDB with an elevated log
level. The query planner then writes more information to the log than at
the default log level. See db.setLogLevel()
<https://docs.mongodb.com/manual/reference/method/db.setLogLevel/> for more
details.
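
For anyone driving this from a script rather than the shell, the sketch
below builds the server command that, to my understanding, corresponds to
db.setLogLevel(): a setParameter command with a logComponentVerbosity
document. The helper name `log_verbosity_command` is hypothetical, and the
0 to 5 range check is a simplification (it does not cover the -1 "inherit"
value that components accept).

```python
def log_verbosity_command(level, component=None):
    """Build a setParameter command that changes log verbosity.

    With no component the global verbosity is set; otherwise only the named
    component (for example "query") is changed.
    """
    if not 0 <= level <= 5:
        raise ValueError("verbosity level must be between 0 and 5")
    if component is None:
        verbosity = {"verbosity": level}
    else:
        verbosity = {component: {"verbosity": level}}
    # The command name must be the first key of the document.
    return {"setParameter": 1, "logComponentVerbosity": verbosity}


cmd = log_verbosity_command(2, "query")
print(cmd)

# Assumption: with PyMongo installed and a reachable server, the command
# could be sent to the admin database, e.g.
#   from pymongo import MongoClient
#   MongoClient().admin.command(cmd)
```

Setting the level back to 0 afterwards keeps the log from growing quickly
while running larger workloads.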

Best regards
Kevin

Monika Shah

2018-01-28 11:58:12 UTC

Can you suggest a test database or benchmark for real-time analytics that
uses MongoDB?
For example, map-based traffic analysis, weather analysis, etc.

I am interested in the data and its associated sample analytical queries. It
would be nice to also have information such as the frequency of each query.
