How to improve analytic efficiency? #1773
-
Hello~ I'm having a performance problem. I currently have some modules that heavily rely on the database to parse the select body, such as 'select A=round(case when B>100 then 4 else 5 end,4)'. Therefore, I wanted to use Jsqlparser to replace the database. After implementing a series of Visitors, I found that its parsing efficiency was relatively low, even when using multiple threads, it was still slow. code: public class JSqlParserItem {
private CCJSqlParser parser;
public JSqlParserManager() {
// init
CCJSqlParser parser = new CCJSqlParser(new StreamProvider(new StringReader("select 1")));
try {
// Since the first parsing takes a long time, it is therefore advisable to perform an unnecessary parsing during initialization.
parser.Statement();
} catch (ParserException e){
e.printStackTrace();
}
this.parser = parser;
}
public CCJSqlParser getParser() {
return parser;
}
public void reInit(String sql) {
this.parser.ReInit(new StreamProvider(new StringReader(sql)));
}
}
public class JSqlParserUtils {
public static void main(String[] args) {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
// Regardless of how many threads are used, the parsing speed remains the same.
executor.setCorePoolSize(20);
executor.setMaxPoolSize(30);
executor.setThreadNamePrefix("JSqlParserThread-->");
executor.setRejectedExecutionHandler(new ThreadPoolExecutor.DiscardOldestPolicy());
ArrayBlockingQueue<JSqlParserManager> queue = new ArrayBlockingQueue<>(20);
for (int i = 0; i <20;i++){
JSqlParserManager manager = new JSqlParserManager();
queue.add(manager);
}
// test SQL
List<String> sqlList = genSqlList();
for (int i = 0; i<10;i++) {
exec(sqlList,executor,queue);
}
}
public static void exec(List<String> sqlList,ThreadPoolTaskExecutor executor,ArrayBlockingQueue<JSqlParserManager> queue){
long startTime = System.currentTimeMillis();
for (int i = 0; i < sqlList.size(); i++) {
String drSQL = sql.get(i);
int drIndex = i;
executor.execute(() -> {
JSqlParserManager drQueue = null;
try {
drQueue = queue.take();
drQueue.reInit(drSQL);
Statement statement = drQueue.getParser().Statement();
map.put(drIndex, statement);
} catch (InterruptedException | ParseException e) {
e.printStackTrace();
} finally {
countDownLatch.countDown();
try {
queue.put(drQueue);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
});
}
try {
countDownLatch.await();
} catch (InterruptedException e) {
e.printStackTrace();
}
long endTime = System.currentTimeMillis();
System.out.println( "usedTime:" + (endTime - startTime ));
}
public static List<String> genSqlList(){
List<String> list = new ArrayList<>();
for (int i = 0; i < 100; i++) {
String sql = "select " + getRandomParam() + " = round(isnull(case when " + getRandomParam() + "=1 or " + getRandomParam() + "=3 then " + getRandomParam() + " when " + getRandomParam() + "=10 or " + getRandomParam() + "=11 then " + getRandomParam() + " end,0),4)";
list.add(sql);
}
return list;
}
public static String getRandomParam() {
String[] arr = new String[]{"A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"};
Random random = new Random();
return arr[random.nextInt(26)];
}
} I created 100 random SQL queries for parsing, and I found that the first parsing took about 300ms, while the following ones maintained at around 150ms. However, when I removed the multi-threaded parsing and parsed each query one by one, the total time taken was the same as not using multi-threading. Can you tell me what went wrong? |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 3 replies
-
Greetings! Unfortunately I am not sure if I understand your question. Are you asking, if JSQLParser parses MULTIPLE statements in parallel? Then the answer is: No, multiple statements are parsed in serial strictly. Also, JSQLParser does not support any kind of parallelism. Also, I do not think that JSQLParser is threadsafe. On parsing the first time: Keep in mind that the parser loads a large number of classes into the JVM, so I would naturally expect the first parse to be slower than any following. |
Beta Was this translation helpful? Give feedback.
Thank you for the explanation, I believe that I do now understand better what you ask for:
It's actually an interesting idea although it begs one very important question: how do you split the List of Expression Items?
a) if you know the Expressions beforehand, then yes, I would expect your parallel approach to work
b) if you do not know the Expressions, but depend on parsing the ExpressionList, then this won't work (since JSQLParser parses every element down to the leaves of the AST)
Btw, you can parse
E…